All Downloads are FREE. Search and download functionalities are using the official Maven repository.

com.imsweb.validation.package.html Maven / Gradle / Ivy


This package contains the classes related to the Validation Engine; see the main class {@link com.imsweb.validation.ValidationEngine}.


Here is a full (although simplistic) example of how the engine can be used to validate a NAACCR file using the SEER edits (the imports have been omitted):
    public static void main(String[] args) throws Exception {

        // location of the SEER edits XML file and the fake NAACCR data file
        File editsFile = new File("C:\\seer-edits.xml");
        File dataFile = new File("C:\\fake-naaccr-5-recs.txt");

        // SEER*Utils uses log4j to log messages; we don't want a log file to be created, we just want the messages
        // at the console, so let's change the default behavior of log4j...
        Logger.getRootLogger().removeAllAppenders();
        Logger.getRootLogger().addAppender(new ConsoleAppender(new PatternLayout("[%c{1}] %m%n")));
        Logger.getRootLogger().setLevel(Level.ERROR);

        // before using any of the methods from the valiation module, we need to initialize the services and context methods;
        // this simple example will use the default implementations, but those classes are designed to be extended to customize the services
        // and add more application-specific context functions...
        ValidatorServices.initialize(new ValidatorServices());
        ValidatorContextFunctions.initialize(new ValidatorContextFunctions());

        // there are two parts to initializing the edits:
        //   1. create a validator object (usually from an XML file); we will use the XmlValidatorFactory for that
        //   2. initialize the engine with the validator, that's where the Groovy gets compiled, so it's not a fast operation
        //      for the SEER edits because they contain so many large contexts...
        if (!editsFile.exists())
            throw new Exception("Unable to find edits file!");
        Validator validator = XmlValidatorFactory.loadValidatorFromXml(editsFile.toURI().toURL());
        System.out.println("Loaded SEER edits " + validator.getVersion() + " with " + validator.getRules().size() + " edits...");
        ValidationEngine.initialize(validator);
        System.out.println("Initialized validation engine with those edits...");

        // this simple example will deal only with single records (as opposed to list of records representing patient sets), therefore
        // we would like to disable the inter-record edits (it's really not necessary, but it's a good example)
        Set<String> toIgnore = new HashSet<String>();
        for (Rule rule : validator.getRules())
            if ("inter-record".equals(rule.getRuleSet().getId()))
                toIgnore.add(rule.getId());
        ValidationEngine.massUpdateIgnoreFlags(toIgnore, null);
        System.out.println("Disabled " + toIgnore.size() + " inter-record edits...");

        // let's use the layout module to validate a fake NAACCR 13 Incidence file, a few comments on this:
        //   1. We could just build a record (a map of strings) in this example and validate that, but it's more fun to use a file
        //   2. The fake file contains 5 records, so we are reading the entire file into memory for simplicity; a real application
        //      should never do that, it should create a LineNumberReader and use the layout.readNextRecord() method
        if (!dataFile.exists())
            throw new Exception("Unable to find data file!");
        LayoutInfo info = LayoutFactory.getLayoutInfo(dataFile);
        if (info == null)
            throw new Exception("Unable to recognize the type of the file!");
        System.out.println("Detected that the data file is a " + info + "...");
        Layout layout = LayoutFactory.getLayout(info.getLayoutId());
        List<Map<String, String>> records = layout.readAllRecords(dataFile);
        System.out.println("Read " + records.size() + " records from the data file...");

        // now we are finally ready to validate the records
        int recordCount = 0;
        for (Map<String, String> rec : records) {
            System.out.println("Validating record #" + (++recordCount) + " (" + rec.get("primarySite") + ")");

            // let's create a validatable object so the engine understands the type of entity we are validating; we are using a simple implementation
            // of a validatable as an example, but more complex applications will usually wrap their entities into their own implementation...
            Validatable validatable = new SimpleNaaccrLinesValidatable(rec);

            // call the validate method; there is no order on the returned failures; if a record is error free, the engine returns an empty collection
            Collection<RuleFailure> failures = ValidationEngine.validate(validatable);

            // let's display the results
            System.out.println("   > got " + failures.size() + " failures");
            int failureCount = 0;
            for (RuleFailure failure : failures) {
                System.out.println("        >> " + failure.getMessage());
                if (++failureCount > 5) {
                    System.out.println("        >> ...");
                    break;
                }
            }
        }
    }
Here is the output when running the example:
        Loaded SEER edits SE13-009-01 with 700 edits...
        Initialized validation engine with those edits...
        Disabled 23 inter-record edits...
        Detected that the data file is a NAACCR 13 Incidence [3,339 char]...
        Read 5 records from the data file...
        Validating record #1 (C755)
           > got 72 failures
                >> Registry ID is not valid.
                >> Year of Conclusive DX is not valid.
                >> Month of Conclusive DX is not valid.
                >> Year of Multiple Tumors is not valid.
                >> Month of Multiple Tumors is not valid.
                >> Day of Multiple Tumors is not valid.
                >> ...
        Validating record #2 (C254)
           > got 46 failures
                >> Behavior ICD-O-3 must be 2 or 3.
                >> Registry ID is not valid.
                >> Day of Conclusive DX is not valid.
                >> Year of Conclusive DX is not valid.
                >> Month of Conclusive DX is not valid.
                >> Year of Multiple Tumors is not valid.
                >> ...
        Validating record #3 (C139)
           > got 37 failures
                >> Registry ID is not valid.
                >> Day of Conclusive DX is not valid.
                >> Year of Conclusive DX is not valid.
                >> Year of Multiple Tumors is not valid.
                >> Month of Multiple Tumors is not valid.
                >> Day of Multiple Tumors is not valid.
                >> ...
        Validating record #4 (C632)
           > got 37 failures
                >> Registry ID is not valid.
                >> Day of Conclusive DX is not valid.
                >> Year of Conclusive DX is not valid.
                >> Month of Conclusive DX is not valid.
                >> Year of Multiple Tumors is not valid.
                >> Month of Multiple Tumors is not valid.
                >> ...
        Validating record #5 (C719)
           > got 84 failures
                >> Registry ID is not valid.
                >> Day of Conclusive DX is not valid.
                >> Year of Conclusive DX is not valid.
                >> Month of Conclusive DX is not valid.
                >> Year of Multiple Tumors is not valid.
                >> Month of Multiple Tumors is not valid.
                >> ...




© 2015 - 2025 Weber Informatics LLC | Privacy Policy