MaskMiner Project Homepage
MaskMiner is a pattern discovery tool developed at the Computer Science Department of the University of Pisa. It is based on a novel definition of pattern, the mask. The interested user can find all the details here.
Command line arguments
To run MaskMiner, Linux users can use the maskMiner.sh script (maskMiner.bat on Windows). Both scripts accept the following parameters:
maskMiner availableMem inputPath maskLength quorum maxNumDonts
availableMem: the maximum amount of memory that can be used by the tool (in MB).
inputPath: the path of the file containing the input genomic sequence. Only the 'A','C','G','T' characters are allowed. Comments are not allowed.
maskLength(<=32): the length of the masks.
quorum(<=32767): the quorum threshold.
maxNumDonts(<=maskLength): the maximum number of allowed don't cares in a maximal mask.
Example: maskMiner 256 input.txt 32 120 4
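The documented limits can be checked before launching the tool. The following is only a minimal sketch in Python and is not part of MaskMiner: the function names are hypothetical, and the lower bounds (positive memory, maskLength and quorum of at least 1, maxNumDonts of at least 0) are assumptions, since this page states only the upper bounds. The page also does not say whether line breaks are accepted in the input file, so the caller decides whether to strip them before the character check.

```python
# Hypothetical pre-flight checks; not part of the MaskMiner distribution.
ALLOWED = frozenset("ACGT")  # the only characters the page lists as allowed


def check_parameters(available_mem, mask_length, quorum, max_num_donts):
    """Return a list of problems with the numeric arguments (empty if none).

    Upper bounds come from the parameter list; lower bounds are assumptions.
    """
    problems = []
    if available_mem <= 0:
        problems.append("availableMem must be a positive number of MB")
    if not 1 <= mask_length <= 32:
        problems.append("maskLength must be between 1 and 32")
    if not 1 <= quorum <= 32767:
        problems.append("quorum must be between 1 and 32767")
    if not 0 <= max_num_donts <= mask_length:
        problems.append("maxNumDonts must be between 0 and maskLength")
    return problems


def find_invalid_characters(sequence):
    """Return the sorted list of characters other than 'A', 'C', 'G', 'T'."""
    return sorted(set(sequence) - ALLOWED)


if __name__ == "__main__":
    # The values of the example command line: maskMiner 256 input.txt 32 120 4
    print(check_parameters(256, 32, 120, 4))
    print(find_invalid_characters("ACGTTGCA"))
```

With the values of the example command line, check_parameters reports no problems. Lower-case letters and FASTA-style header lines would be flagged by find_invalid_characters, because only the four upper-case characters are listed as allowed.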
MaskMiner produces three output files:
The maximal masks, one per row.
The instance file. Each row contains one mask instance, the size of its occurrence list and its occurrence list. Instances of the same maximal mask are reported in consecutive rows.
The file with the statistics. Each row contains the statistic's name and its value.
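The rows described above could be read with a few lines of Python. This is a sketch under stated assumptions, not the documented file format: it assumes that the fields of a row are separated by whitespace, that occurrences are written as integers, and that in the statistics file the value is the last field of the row. The function names are hypothetical.

```python
# Hypothetical readers for the output rows; the field separators are assumed.

def parse_instance_row(row):
    """Split an instance row into (instance, occurrences).

    Assumed layout: instance, size of the occurrence list, then the
    occurrences themselves, all separated by whitespace.
    """
    fields = row.split()
    instance, size = fields[0], int(fields[1])
    occurrences = [int(field) for field in fields[2:]]
    if len(occurrences) != size:
        raise ValueError(
            "declared %d occurrences, found %d" % (size, len(occurrences))
        )
    return instance, occurrences


def parse_statistic_row(row):
    """Split a statistics row into (name, value), taking the last field as the value."""
    name, value = row.rsplit(None, 1)
    return name, value


if __name__ == "__main__":
    # Made-up rows, used only to illustrate the assumed layout.
    print(parse_instance_row("ACGT 3 10 57 102"))
    print(parse_statistic_row("maximal masks 12"))
```

Checking the declared size against the number of occurrences actually read is a cheap way to notice that the assumed layout does not match the real files.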
Installation
Download fastutil and dsiutil.
Copy the fastutil and dsiutil jar files into the directory where the contents of the MaskMiner zip file were extracted.
Run MaskMiner (e.g. maskMiner 256 input.txt 32 120 4).
Contents of the zip file
maskMiner.jar: the jar file (requires Java 1.6).
maskMiner.sh: the Linux script to run MaskMiner.
maskMiner.bat: the Windows script to run MaskMiner.
*.txt: some input sequences.
The fastutil and dsiutil jars are not shipped with MaskMiner, but they can be downloaded for free from their web sites.