TEA

Methylation Landscapes

mtable

5-methylcytosine (5mC) sites in a genome are often classified by three sequence contexts (CG, CHG, CHH) and by the region in which the C is located. To capture the essence of genome methylation status while keeping online analysis efficient, we introduce a straightforward method to measure methylation landscapes with respect to the sequence contexts.

Briefly, we count the reads mapped to each C, keep the C sites covered by at least 4 reads, and average the methylation percentage over the sites of each sequence context type, requiring at least five such sites.

That is to say, we give six measurements for each gene (i.e., pmt_CG, pmt_CHG, pmt_CHH, gene_CG, gene_CHG, gene_CHH).
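
The counting rule can be sketched in Python as follows. This is only an illustration of one reading of the description, not the EpiMolas.jar source: the function name context_score, the two constants, and the input layout (pairs of methylated and total read counts) are assumptions made for the example, as is taking each site's methylation level to be methylated reads divided by total reads.

```python
import math

MIN_READS = 4   # minimum read coverage for a C site to be counted
MIN_SITES = 5   # minimum number of covered C sites needed for a score


def context_score(sites):
    """Average methylation level over sufficiently covered C sites.

    `sites` is a list of (methylated_reads, total_reads) pairs for one
    sequence context (e.g. CG) in one region. Returns NaN when fewer
    than MIN_SITES sites pass the coverage filter.
    """
    levels = [m / t for m, t in sites if t >= MIN_READS]
    if len(levels) < MIN_SITES:
        return math.nan
    return sum(levels) / len(levels)


# Hypothetical read counts, not taken from a real data set.
covered = [(4, 4), (0, 8), (2, 4), (5, 10), (0, 5), (1, 2)]
print(context_score(covered))      # the (1, 2) site is dropped: coverage < 4
print(context_score(covered[:3]))  # only three usable sites, so NaN
```

Calling the function once per context and region would give the six values reported for a gene.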

This method is implemented in an in-house program, EpiMolas.jar, which processes BS-Seq mapping results into a small, tab-delimited data file, the mtable:

gene_id	pmt_CG	gene_CG	pmt_CHG	gene_CHG	pmt_CHH	gene_CHH
AT1G01010	0.011463	0.053009	0.010000	0.011635	0.021765	0.012631
AT1G01020	0.000000	0.081519	0.006957	0.007177	0.003614	0.007521
AT1G01030	0.005385	0.012800	0.002439	0.023452	0.003116	0.016939
AT1G01040	0.011200	0.589821	0.009677	0.015773	0.016944	0.011699
AT1G01046	0.765250	0.385000	0.022500	0.058750	0.014325	0.047727
........

Note: The column names and values are not aligned for display on the webpage, but each is separated from its neighbors by a tab. The first row holds the seven column names: the gene id column followed by the six data columns.

These measurements are normalized scores ranging from 0 (all observed sites are unmethylated) to 1 (all observed sites are methylated), or "NaN" for genes that do not have sufficient reads/sites to calculate the value. A difference of 0.1 in a measurement reflects a change in methylation state at 10% of the observed Cs.
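
As a minimal sketch of how an mtable could be read downstream, the Python snippet below parses the tab-delimited layout and treats "NaN" as a missing value. The function name read_mtable and the row GENE_X are hypothetical and exist only for this example; the other two rows are copied from the table above.

```python
import csv
import io
import math

# Two rows copied from the example above, plus a hypothetical gene with
# NaN values to show how missing measurements are handled.
SAMPLE = (
    "gene_id\tpmt_CG\tgene_CG\tpmt_CHG\tgene_CHG\tpmt_CHH\tgene_CHH\n"
    "AT1G01010\t0.011463\t0.053009\t0.010000\t0.011635\t0.021765\t0.012631\n"
    "AT1G01040\t0.011200\t0.589821\t0.009677\t0.015773\t0.016944\t0.011699\n"
    "GENE_X\tNaN\t0.100000\tNaN\tNaN\tNaN\t0.020000\n"
)


def read_mtable(handle):
    """Return {gene_id: {column: float}} from a tab-delimited mtable."""
    table = {}
    for row in csv.DictReader(handle, delimiter="\t"):
        gene_id = row.pop("gene_id")
        # float("NaN") parses to a float NaN, so missing values survive.
        table[gene_id] = {name: float(value) for name, value in row.items()}
    return table


mtable = read_mtable(io.StringIO(SAMPLE))
# Keep genes whose gene_CG score is defined and at least 0.5.
high_cg = [g for g, v in mtable.items()
           if not math.isnan(v["gene_CG"]) and v["gene_CG"] >= 0.5]
print(high_cg)
```

For a real file, the same function can be given an open file handle, for example open("result.mtable", newline=""), in place of the in-memory sample.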

Executing EpiMolas.jar to generate mtable

Check The Java Environment

Before you run EpiMolas.jar, please check that Java is properly installed in your Linux environment. For example, simply run a version check:

java -version

and you will get output like:
```
  openjdk version "1.8.0_45-internal"
  OpenJDK Runtime Environment (build 1.8.0_45-internal-b14)
  OpenJDK 64-Bit Server VM (build 25.45-b02, mixed mode) 
```
If not, ask your administrator for help installing Java.

Download EpiMolas.jar from GitHub (first move to the directory where you want to save the file):

wget https://github.com/markchiang/EPI-MOLAS/releases/download/0.3/EpiMolas.jar

Converting the report to mtable:

First, we assume that you have completed the mapping process and have the right mapping report: *.CGmap from BS-Seeker2, or CX_report.txt from Bismark.
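
To make the input concrete, here is a hedged Python sketch that reads CGmap records. It assumes the eight tab-separated columns commonly described for BS-Seeker2 CGmap files (chromosome, nucleotide, position, context, dinucleotide context, methylation level, methylated read count, total read count); please check this against your own files. The Site tuple, the function name parse_cgmap_line, and the sample records are hypothetical. Bismark CX_report files use a different column order and are not covered here.

```python
from collections import namedtuple

# Assumed CGmap columns: chromosome, nucleotide, position, context,
# dinucleotide context, methylation level, methylated reads, total reads.
Site = namedtuple("Site", "chrom pos context methylated total")


def parse_cgmap_line(line):
    """Parse one tab-separated CGmap line under the assumed layout."""
    fields = line.rstrip("\n").split("\t")
    return Site(fields[0], int(fields[2]), fields[3],
                int(fields[6]), int(fields[7]))


# Hypothetical records, not taken from a real data set.
lines = [
    "chr1\tC\t101\tCG\tCG\t0.75\t3\t4\n",
    "chr1\tG\t135\tCHH\tCA\t0.00\t0\t9\n",
]
sites = [parse_cgmap_line(line) for line in lines]
print(sites[0])
```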

Usage: java -jar EpiMolas.jar the_input_mapping_report_file gtf > the_output_file

mtable from a BS-Seeker2 CGmap output file (e.g., the input file: my.CGmap and the output mtable file: result.mtable):
java -jar path_to/EpiMolas.jar path_to/my.CGmap path_to/TAIR10.gtf > result.mtable &

mtable from a Bismark CX_report output file (e.g., the input file: my.CX_report.txt and the output mtable file: result.mtable):
java -jar path_to/EpiMolas.jar path_to/my.CX_report.txt path_to/TAIR10.gtf > result.mtable &

You need to give the paths to the required files (EpiMolas.jar, the input mapping report file, and the gtf) if they are not in the directory where you run the command.

You may specify -Xmx to set the maximum memory in use and -Xms to set the initial memory. Empirically, if your *.CGmap file is X GB in size, you may assign 2.3*X GB to -Xms to help ensure the run succeeds.

If 10G of RAM is allocated as the initial memory of the run, the command will look like:

java -Xms10G -jar EpiMolas.jar exp1.CGmap TAIR10.gtf > exp1.mtable &
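
The 2.3*X rule of thumb can be turned into a small helper. The Python sketch below is an illustration only: the function name suggest_xms is hypothetical, and rounding up to a whole number of GB (with 1 GB taken as 1024^3 bytes) is an assumption, since the text gives only the multiplier.

```python
import math

XMS_FACTOR = 2.3  # empirical multiplier quoted in the text above


def suggest_xms(size_bytes):
    """Return a -Xms flag sized at 2.3x the input file, rounded up to whole GB."""
    size_gb = size_bytes / 1024 ** 3
    return "-Xms{}G".format(max(1, math.ceil(size_gb * XMS_FACTOR)))


# A hypothetical 4 GB CGmap file: 4 * 2.3 = 9.2, rounded up to 10.
print(suggest_xms(4 * 1024 ** 3))
```

For a file on disk, os.path.getsize("exp1.CGmap") gives the size in bytes to pass in.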

Institute of Information Science, Academia Sinica, TAIWAN.

Latest update 2016/07/15