Computational Identification of miRNAs and Tempera PDF
Please note that this article has not completed peer review.
Peijing Zhang
Zhejiang University Life Science Institute
Yujie Chen
Inner Mongolia University for the Nationalities
Ming Chen
Zhejiang University
[email protected] Author
ORCiD: https://orcid.org/0000-0002-9677-1699
10.21203/rs.3.rs-22698/v1
SUBJECT AREAS
Plant Physiology and Morphology, Plant Molecular Biology and Genetics
KEYWORDS
mango (Mangifera indica), miRNA, lncRNA, stress response, target genes,
computational study
Abstract
Background
Mango is a major tropical fruit and is known as the king of fruits because of its flavour,
aroma, taste, and nutritional value. Moreover, various parts of mango trees have been used for
medicinal purposes. Although the regulatory roles of miRNAs and lncRNAs have been investigated
in many plants, no such study has yet been reported for mango. This is the first study to provide
information on the ncRNAs of mango, with the aim of identifying its miRNAs and lncRNAs and
discovering their potential functions through interaction prediction among the miRNAs, lncRNAs and their
target genes.
Results
In this analysis, 104 miRNAs and 7,610 temperature responsive lncRNAs were identified and the
target genes of these ncRNAs were characterized. Analysis of the interactions between miRNAs and their
target genes indicated that the miRNAs are mainly involved in growth, development, and stress
responses of mango. Among the lncRNAs, cold responsive lncRNAs bound to low temperature responsive
proteins expressed under low temperature stress. GO enrichment analysis of heat and cold responsive
lncRNAs revealed that they are involved in all three GO categories: biological process, cellular
component, and molecular function. Moreover, mango lncRNAs can target miRNAs to reduce the
Conclusion
This paper provides new information about the miRNAs and lncRNAs of mango and would help
Background
Non-coding RNAs (ncRNAs) are RNA molecules that have little or no protein-coding potential: they are
transcribed from DNA but are not translated into proteins. Small ncRNAs such as microRNA
(miRNA), small interfering RNA (siRNA), small nucleolar RNA (snoRNA) and piwi-interacting RNA
(piRNA) are shorter than 200 nucleotides (nt) in length and long non-coding RNAs (lncRNAs) are longer
The miRNAs are small (18–24 nt), endogenous, regulatory RNA molecules derived from long
self-complementary precursor sequences that can fold into hairpin secondary structures [1]. In
plants, these long primary precursor miRNAs are transcribed by RNA polymerase II or RNA polymerase
III and then processed by the dicer-like 1 enzyme (DCL1) into a miRNA/miRNA* duplex [1, 2]. Finally, the
mature miRNAs are incorporated into an RNA-induced silencing complex (RISC) [3]. Because
miRNAs bind to their target mRNAs with perfect or nearly perfect complementarity, their targets can be
identified by BLAST analysis or other publicly available software [4]. Many
experimental studies have shown that miRNAs are involved in many important biological and
metabolic processes. In plants, miRNAs play a fundamental role in almost all biological and metabolic
processes including plant growth, development, signal transduction and various stress responses by
LncRNAs are a family of regulatory RNAs that are at least 200 nt long and are unable to encode
proteins. Most lncRNAs are transcribed by RNA polymerase II, although some are transcribed by RNA
polymerase III [6–8]. LncRNAs can interact with other ncRNAs such as miRNAs [9]. LncRNAs can not only
interact with miRNAs in a way that reduces lncRNA stability but can also function as molecular decoys or
sponges of miRNAs [10]. Moreover, lncRNAs can compete with miRNAs for binding to target mRNAs
and can serve as precursors for the generation of miRNAs that silence target mRNAs [11]. Considerable evidence
shows that plant lncRNAs play an important role in fundamental biological processes including
growth, development and abiotic stress responses [12, 13]. However, the molecular basis of how
lncRNAs function and mediate gene regulation is still poorly understood [14].
The genus Mangifera belongs to the family Anacardiaceae and contains about 69 species.
Mangifera indica L. (mango) is the most common species among them [15, 16]. Mango is one of the
main tropical fruits worldwide and is believed to have originated in Asia [17]. The best-known
countries for mango cultivation are China, India, Thailand, Pakistan, Mexico, the Philippines and Myanmar.
The annual production of mango is approximately 42 million tons, second only to banana
production [18]. Mango is called the king of fruits because of its characteristic flavour,
pleasant aroma, taste, and nutritional value. Both ripe and raw fruits can be used in food
products such as pickles, juice, jam, powder, sauce, cereal flakes and so on [19]. Moreover, various
parts of mango trees have long been used for medicinal purposes, mostly in Southeast
Asian and African countries [20]. In vitro and in vivo studies have indicated the various
Although mango is a popular plant with many important uses, its ncRNA data are still limited. Over
10,000 miRNA entries from several plants can be accessed in the miRNA database miRBase, but mango
miRNAs and their functions have not yet been identified. The regulatory roles of lncRNAs and the
molecular basis of lncRNA-mediated gene regulation are also still poorly understood in plants,
including mango. The aim of this work is therefore to identify the miRNAs and
lncRNAs of mango and to examine their potential functions by the interaction prediction of the
Results
Identification and characterization of mango miRNAs
From the mango unigene sequences, we identified 104 miRNAs by following the identification
workflow shown in Figure 1. The resulting mature miRNAs range from 18 to 22 nt in length.
Nearly 40% of them (41 miRNAs) are 18 nt long, 32 are 19 nt, 17 are 20 nt, 8 are 21 nt and 6
are 22 nt (Figure 2A). In contrast, the precursor length of mango miRNAs (MmiRs) varied
considerably, from 67 to 144 nt, with an average length of 94 nt. The secondary structures of the precursor
sequences were predicted with the Zuker folding algorithm in MFOLD. The hairpin structures of five miRNAs
are shown in Figure 2B. The average MFE of the pre-miRNAs is 29.92. The MFEI values were also calculated
and ranged from 0.7 to 1.45, with an average MFEI of 0.84 (Table S1).
According to target gene prediction with the psRNATarget server, all the newly identified
mango miRNAs could bind to targets, and a total of 2,347 target genes were predicted for the 104
mango miRNAs. The predicted target genes were annotated and assigned GO terms with BLAST2GO.
The GO analysis showed that the predicted target genes of mango miRNAs are involved in all
three broad categories: biological process, cellular component and molecular function. Among the
2347 target genes, 2081 were enriched in 925 GO terms: 502 in biological process,
127 in cellular component and 296 in molecular function. Highly enriched GO terms of the miRNA target
genes are visualized in Figure 2C. In the KEGG pathway analysis, 310 targets were involved in
103 different KEGG pathways. Purine metabolism was the pathway with the most target genes (136).
The predicted miRNAs, their target genes, target descriptions, target GO terms and target KEGG
For the identification of lncRNAs, a total of 277,071 RNA transcripts from the Zill, Shelly and Keitt mango
cultivars were used. First, sequences shorter than 200 nt were removed because lncRNAs are
by definition longer than 200 nt. Then, coding transcripts were removed on the basis of their protein-coding
potential, homology with known proteins, and potential ORFs. Finally, house-keeping RNAs and
miRNA precursors were removed. After this series of filtering steps, a total of 31,226 candidate
The temperature responsive lncRNAs were then defined by fold change value and FDR. A fold change
value of <-2 or >2 and an FDR-adjusted p-value < 0.05 were used to select the significantly differentially
expressed mango lncRNAs. As a result, 24 lncRNAs responded significantly to heat stress (55°C hot
water brushing) and 7586 lncRNAs to cold stress (5°C, 8°C or 12°C) (Figure 3A). Among the heat responsive
lncRNAs, 18 were upregulated and 6 were downregulated. The length of the heat
responsive lncRNAs ranged from 213 to 1186 nt (Table S3). Among the 7586 cold responsive
lncRNAs, 4335 were upregulated and 3251 were downregulated. The length of cold responsive
The mango lncRNAs were searched with BLASTn against the plant lncRNA database CANTATAdb,
with an e-value cutoff of 1e-20, to check their evolutionary conservation. As a result, no heat responsive
lncRNA was conserved, whereas 22 cold responsive lncRNAs were conserved with 12 different plant
Target gene prediction of lncRNAs
To analyse the interaction of the newly identified mango lncRNAs with protein-coding genes, the
lncRNA target prediction tool LncTar was used. A total of 1998 mango mRNAs downloaded from NCBI
were used for the target prediction. In the resulting data, 6975 lncRNAs interacted with
1985 target mRNAs. To obtain a functional overview of the identified lncRNAs, their targets
were annotated with BLAST2GO. Among the 24 heat responsive lncRNAs, 8 had
115 target genes (SCL14, STP13, Hsp70, At4g39970, ACO1 and so on) involved in plant
development and stress response. Among the cold responsive lncRNAs, 6951 interacted with 1985
target genes. The WRKY proteins are a large family of transcriptional regulators in higher plants, and 64
cold responsive lncRNAs interacted with the WRKY gene family in this study (Figure 3C).
Moreover, functional prediction of the target genes of the identified lncRNAs was performed by GO
enrichment analysis. For heat responsive lncRNAs, 11 GO terms were enriched in biological
process, 8 GO terms in cellular component, and 3 GO terms in molecular function.
In biological process, metabolic process, cellular process, cellular component biogenesis
and cellular metabolic process were highly enriched. In the cellular component analysis, GO terms
associated with membranes and the intracellular compartment were highly enriched. Catalytic activity and binding
were the highly enriched GO terms in the molecular function analysis (Figure 3D). For the target genes of the
cold responsive lncRNAs, 40 GO terms were highly enriched; 20 were enriched in
biological process, 7 in cellular component and 14 in molecular function. Metabolic processes and
cellular processes were highly enriched in biological process. In the cellular component analysis,
GO terms related to membranes, the intracellular compartment and the cytoplasm were highly enriched. In the molecular function
analysis, most of the enriched GO terms were related to catalytic activity and binding (Figure 3E).
In the KEGG pathway analysis, heat responsive lncRNAs had target genes involved in
17 KEGG pathways (Table S5). Among these pathways, amino sugar and nucleotide sugar
metabolism was the most significant, with 8 target genes involved. For cold
responsive lncRNAs, 209 target genes mapped to 87 KEGG pathways (Table S6). JK513026_1,
alcohol dehydrogenase 1 (ADH1, EC:1.1.1.1), was the most enriched target gene and was involved in 12
different pathways.
To analyse the direct interaction of miRNAs and lncRNAs of mango, the psRNATarget server was used
to predict the target lncRNAs of miRNAs. The resulting data showed that 3 heat responsive lncRNAs
interacted with 6 miRNAs (Table S7). For cold responsive lncRNAs, 763 lncRNAs had 1203 pairs of
A miRNA target mimicry search was also performed using TAPIR. No heat responsive lncRNA acted
as a target mimic of miRNAs, but 20 cold responsive lncRNAs were predicted as target mimics
of 20 miRNAs (Table S9). CRlnc31221 was the target mimic of MmiR5408, which targeted 8 cold
responsive lncRNAs and 47 target genes (Figure 4B). The base-pairing interaction between MmiR5408 and
its target mimic, the cold responsive lncRNA CRlnc31221, is shown in Figure 4C.
The interaction network of mango ncRNAs (miRNAs, lncRNAs and mimics) and their target genes was
visualized using Cytoscape and contained a total of 5388 pairs of interactions among miRNAs, lncRNAs
and their targets (Figure 4A). These interactions comprised 4155 pairs between 104 MmiRNAs and 2347 mRNAs,
1203 pairs between 89 MmiRNAs and 763 CRlncRNAs, 6 pairs between 6 MmiRNAs and 4 HRlncRNAs, and 24 pairs
Discussion
Identification, characterization and target gene prediction of miRNAs
Most plant miRNAs are evolutionarily conserved from species to species [22, 23], and this
conservation provides a powerful strategy for identifying new miRNAs by using already known
miRNAs [24]. Many conserved miRNAs have been identified from expressed sequence tags (ESTs)
[25, 26] and genome survey sequences (GSSs) [27] by using this homology search approach. For
mango, there are no GSS data and only 1709 ESTs are available, which is not
sufficient for the identification of miRNAs. Hence, unigene sequences (107,744) were used for the
identification of miRNAs in this study. A unigene is a unique transcript that is transcribed from a
genome, and miRNAs have been identified from the unigenes of many plant species such as
Artemisia annua [28], coconut [29], litchi fruit [30] and black pepper [31].
The 104 potential pre-miRNAs of mango were predicted based on the parameters of Zhang [25], and
the MFEI values were also calculated because the MFEI gives the best prediction of miRNAs [32]. Although
the predicted mature miRNAs were 18-22 nt long, the length of the precursor
miRNAs varied considerably, from 67 to 144 nt, with an average length of 94 nt. The 104 predicted
mango miRNAs belong to 86 different families. Over 70% of these miRNA families have
only one member. The miR2673 family had the most members (5), followed
by miR159 with 4 members. The remaining families have 2 or 3 members.
The distribution of mango miRNAs across families is therefore highly
heterogeneous.
Previous studies have shown that plant miRNAs bind to their targets with perfect or
nearly perfect complementarity, and thus the psRNATarget server was used to search for the target genes
of mango miRNAs in this study. Both mRNAs collected from NCBI and mRNAs identified in this study
were used as target candidates because Mangifera indica target
candidates are absent from the psRNATarget server. Previous studies indicated that miR156 is a master
regulator of the juvenile phase in plants and that it targets the squamosa promoter binding protein-like
(SPL) gene family to regulate the transition from the vegetative phase to the floral phase in Arabidopsis,
maize and rice [33-39]. In mango, MmiR105772, a member of the miR156 family, also bound to its target SPL6;
thus the predicted targets of mango miRNAs agree with previously published
results in other plants. The psRNATarget results also showed that only one miRNA,
MmiR1653, a member of the miR482 family, had a single target gene:
monodehydroascorbate reductase 4, an important gene related to the nutritional quality of
mango fruit [33]. All other miRNAs could target multiple genes, and some miRNAs had over two
hundred target genes. For example, MmiR73030 had 230 target genes and these target genes
and so on.
Sivankalyani, Sela et al. reported that the mango stress-response pathways were activated by cyclic
nucleotide-gated channel (CNGC) and leucine-rich repeat receptor (Lrr) [40]. In this study, we found
that MmiR90392 targeted CNGC1, and MmiR68471 and MmiR68478 targeted Lrr2. Moreover,
MmiR10167 and MmiR15558 bound to the stress-related WRKY transcription factor 44, which plays a major role
in plant defence against biotic and abiotic stresses. MmiR78769 and MmiR101928 also bound to their
target genes phospholipase A and phospholipase D, which are key factors in plant responses to
biotic and abiotic stresses [41]. The ethylene response can improve the tolerance of mango fruit to
chilling stress [42], and ten mango miRNAs identified in this study had six ethylene responsive target
genes such as ethylene-insensitive protein and ethylene-responsive transcription factor. These
newly identified mango miRNAs therefore have potential roles in the chilling stress response of mango.
Two mango miRNAs (MmiR23777 and MmiR36814) also targeted the auxin efflux carrier, which has
a potential role in mango organ development [43]. A total of 17 miRNAs interacted with
auxin-related genes. MmiR51876 targeted an auxin responsive protein. The pentose and
acid metabolism pathway were KEGG pathway involved in the adventitious root formation of mango
cotyledon segments [44]. In this study, 9 miRNAs bound to 8 target genes involved in these three
pathways for mango root formation. MmiR10167 bound to target genes involved in the
phenylpropanoid biosynthesis pathway and MmiR7519 bound to target genes involved in the
alpha-linolenic acid metabolism pathway. From these findings, it could be observed that these five mango
As the genome sequence of mango is not yet available, de novo assembled transcriptome
sequences were used for the identification of lncRNAs in this study. A total of 277,071 RNA transcripts
from the Zill, Shelly and Keitt mango cultivars reported by previous researchers were used, and a total of
31,226 candidate lncRNAs were predicted. Among them, 24 lncRNAs responded significantly
to heat stress and 7586 lncRNAs to cold stress. The most strongly
down-regulated heat responsive lncRNA was HRlnc25944, with a fold change value of -6.22. HRlnc11351
and HRlnc27371 were the most strongly up-regulated lncRNAs, with fold change values greater
than 7. Among the cold responsive lncRNAs, CRlnc10871 was the most strongly down-regulated
lncRNA (FC value -11.19), and CRlnc26299, CRlnc30496 and CRlnc36473 were the most significantly
No heat responsive lncRNA was conserved, but 0.29% of the cold responsive lncRNAs were conserved
with 12 different plant species. Among them, the most highly conserved lncRNAs were CRlnc32663 and
CRlnc47883, each of which was conserved with 4 different lncRNAs of other plants. CRlnc32663
was conserved with 4 different lncRNAs of 3 plant species: Manihot esculenta, Malus
domestica and Populus trichocarpa. CRlnc42883 was also conserved with 4 lncRNAs of Oryza rufipogon,
Among the heat responsive lncRNAs, 8 bound to 115 target genes involved in plant development and
stress response. HRlnc11351 was the most strongly up-regulated lncRNA, with a fold
change value of 7.55, and bound to six heat shock proteins. Among the cold responsive lncRNAs, CRlnc26299
was one of the most strongly up-regulated and bound to RC12B (JK513200_1),
a low temperature and salt responsive protein found in Arabidopsis thaliana [45]. The
WRKY proteins are a large family of transcriptional regulators in higher plants and exhibit
variable expression patterns in response to chilling stress in cucumber, mango and rice [40, 46, 47].
In this study, 64 cold responsive lncRNAs interacted with the WRKY gene family. We can therefore observe
that the cold responsive lncRNAs of mango interact with target genes that are
GO enrichment analysis and KEGG pathway analysis were performed to better understand
the target genes of the newly identified lncRNAs. The GO enrichment analysis showed that
both heat responsive and cold responsive lncRNAs interacted with target
genes involved in all three broad categories: biological process, cellular component and
molecular function. In biological process, the targets of both heat responsive and cold responsive lncRNAs
were highly enriched in metabolic processes and cellular processes. In the cellular component
analysis, GO terms related to membranes, the intracellular compartment and the cytoplasm were highly enriched for both types of
lncRNAs. In the molecular function analysis, most of the enriched GO terms for both types of lncRNAs
were likewise related to catalytic activity and binding. The GO terms highly
enriched for heat responsive and cold responsive lncRNAs therefore did not differ greatly.
Among the 17 KEGG pathways of the target genes of the heat responsive lncRNAs, amino sugar and
nucleotide sugar metabolism was the most significant, with 8 target genes involved.
As mentioned above, HRlnc11351 was the most strongly up-regulated
lncRNA; its target gene, JK513625_1, is 3-ketoacyl-CoA thiolase 2 (KAT2, EC:2.3.1.16), which could
be mapped to 9 different pathways such as benzoate degradation, fatty acid elongation, biosynthesis
of unsaturated fatty acids, alpha-linolenic acid metabolism, fatty acid degradation, valine, leucine and
degradation according to the result of KEGG pathway analysis. In Arabidopsis, KAT2 is an enzyme that
catalyses the β-oxidation of fatty acids and is involved in abscisic acid (ABA) signal transduction [48]. The
phytohormone ABA plays an important role in plant development and adaptation to diverse
environmental stresses. Therefore, HRlnc11351 may play an important role in mango
development and stress response by targeting KAT2. For cold responsive lncRNAs, 209 target
genes mapped to 86 KEGG pathways. JK513026_1, alcohol dehydrogenase 1 (ADH1, EC:1.1.1.1),
was the most enriched target gene and was involved in 12 different pathways including
threonine metabolism, methane metabolism, fatty acid degradation and so on. In plants, ADH genes
are involved in mediating stress responses and development. In mango, ADH1 has an important role in
fruit ripening [49]; thus, cold responsive lncRNAs that target the ADH1 gene may play an
important role in the mango fruit ripening process. According to the KEGG pathway analysis results,
purine metabolism and biosynthesis of antibiotics were the most highly enriched pathways among the 86
pathways, with more than 50 target genes enriched in each.
The interactions between miRNAs and lncRNAs showed that most of the miRNAs targeted
more than one lncRNA and only 8 miRNAs had a single target lncRNA. The number of lncRNAs
targeted by a single miRNA ranged from 1 to 90. A total of 90 target lncRNAs were found for
MmiR73030, which also targeted 230 mRNAs. This miRNA had the highest number of targets in both
LncRNAs can not only interact with miRNAs in a way that reduces lncRNA stability but can also function as
molecular decoys or sponges of miRNAs [10, 50]. A miRNA target mimicry search was therefore
performed using TAPIR, a web server for the prediction of plant miRNA targets including
target mimics. Although no heat responsive lncRNA acted as a target mimic of miRNAs, 20 cold
responsive lncRNAs were predicted as target mimics of 20 miRNAs. CRlnc31221 was the target
mimic of MmiR5408, which targeted 8 cold responsive lncRNAs and 47 target genes. These target
genes were involved in the starch and sucrose metabolism, inositol phosphate metabolism and
phenylpropanoid biosynthesis pathways, which are important for plant growth and
development and for plant responses to biotic and abiotic stress. During target mimicry, the
interactions between miRNAs and their authentic targets are blocked by the binding of decoy RNA to
the miRNAs via partially complementary sequences [51]. The target mimicry of CRlnc31221 therefore has a
potential regulatory effect on the interaction between MmiR5408 and its target genes.
Conclusion
In conclusion, this study identified 104 miRNAs and 7610 temperature responsive lncRNAs from
mango transcriptome sequences and predicted the interactions of these ncRNAs with their target genes.
According to the results, the newly identified mango ncRNAs, like other plant ncRNAs, have
potential roles in biological and metabolic pathways including plant growth and development,
pathogen defence, and stress responses. The resulting data
may therefore support further prediction of the specific functions of mango ncRNAs and guide wet
lab experiments.
Methods
Data collection
A total of 10,415 plant miRNAs (release 21) were downloaded from miRBase database
resulting 6042 non-redundant known miRNAs were used as the reference for the prediction of
conserved miRNAs.
miRNAs.
As the whole genome of mango is not available, the publicly available de novo assembled
transcripts were used for the prediction of candidate lncRNAs. A total of 277,071 mango transcripts
obtained from Zill, Shelly and Keitt mango cultivars [40, 52-54] were used in this study. A total of
1998 mango mRNAs downloaded from NCBI were used for the target prediction of miRNAs and
lncRNAs.
First, a homology search of mango unigenes against non-redundant plant miRNAs was performed
using BLASTn with default parameters. The following criteria were used to choose candidate
miRNAs: the candidate miRNA should be at least 18 nt long without gaps, and the
number of mismatches between mango sequences and plant miRNAs should not exceed 2. The
sequences 100 nt upstream and 100 nt downstream of the BLAST hit were extracted as
precursor sequences. If the query sequence was shorter than 200 nt, the entire sequence was
selected. BLASTx against the NCBI non-redundant (Nr) protein database, with an e-value cutoff of 0.01,
was used to remove protein-coding sequences from the extracted precursor sequences. The
secondary structures of the remaining precursor sequences were predicted using the Zuker folding
algorithm in the MFOLD software [55] with default parameters. The workflow for the identification of
miRNAs is briefly described in Figure 1. Based on the parameters of Zhang [25], the potential pre-
secondary structure;
3. it should contain the mature miRNA within one arm of the hairpin;
4. the predicted mature miRNAs and its opposite miRNA* sequence in the other arm
6. maximum size of a bulge in the mature miRNA sequences should not be more than 3
nt;
7. the predicted secondary structures should have higher negative minimal free
The following equations were used to calculate the minimal free energy index (MFEI) and adjusted
MFEI = AMFE/(G+C)%
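As a worked illustration of the MFEI equation above, the following minimal Python sketch computes the index for a precursor sequence. It is not the authors' script. The AMFE definition used here (MFE per 100 nt of precursor) is an assumption based on common usage in the plant miRNA literature, since the AMFE equation is not shown in this text. The function names and the toy sequence are hypothetical, and the MFE is passed as a magnitude so that the result is positive, like the values reported in the Results.

```python
def gc_percent(seq: str) -> float:
    """Return the G+C content of a sequence as a percentage (0-100)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    gc = sum(1 for base in seq if base in "GC")
    return 100.0 * gc / len(seq)


def amfe(mfe: float, length: int) -> float:
    """Adjusted MFE: folding energy per 100 nt (assumed definition)."""
    if length <= 0:
        raise ValueError("length must be positive")
    return mfe * 100.0 / length


def mfei(mfe: float, seq: str) -> float:
    """MFEI = AMFE / (G+C)%, following the equation in the text."""
    gc = gc_percent(seq)
    if gc == 0:
        raise ValueError("sequence contains no G or C")
    return amfe(mfe, len(seq)) / gc


if __name__ == "__main__":
    # Toy 100-nt precursor with 50% G+C and a hypothetical |MFE| of 42.0.
    toy_precursor = "GC" * 25 + "AU" * 25
    print(round(mfei(42.0, toy_precursor), 2))
```

With these toy inputs the AMFE is 42 and the G+C content is 50%, so the index is 42/50.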
To predict the lncRNAs, transcripts shorter than 200 nt were first removed. The coding potential
of the remaining transcripts was then calculated by CPC [56] and LncFinder [57]. Only sequences with a
CPC score less than -1 and a LncFinder score less than 0.5 were used for further prediction. The protein
coding sequences were removed by BLASTx against the NCBI Nr protein database. ORFfinder
(https://www.ncbi.nlm.nih.gov/orffinder/) was used to predict the open reading frames (ORFs) of the
remaining sequences, and an ORF length cutoff of less than 102 amino acids was applied.
Then, house-keeping RNAs were removed by searching against the Rfam database (http://rfam.xfam.org)
with an e-value of 0.001. Finally, to remove lncRNAs acting as precursors of known or novel miRNAs,
the lncRNAs were aligned with precursors of known non-redundant plant miRNAs from the miRBase
database (http://www.mirbase.org/) using BLASTn with default parameters (Figure 1).
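The filtering cascade described above can be sketched as a single predicate over per-transcript annotations. This is a minimal illustration under stated assumptions, not the authors' pipeline: the `Transcript` record and its field names are hypothetical, the scores and hit flags are assumed to have been computed beforehand by CPC, LncFinder, BLASTx, ORFfinder, Rfam and miRBase searches, and the ORF rule is read here as "keep transcripts whose longest ORF is shorter than 102 amino acids".

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Transcript:
    name: str
    length_nt: int            # transcript length in nucleotides
    cpc_score: float          # CPC coding-potential score
    lncfinder_score: float    # LncFinder score
    longest_orf_aa: int       # longest predicted ORF, in amino acids
    has_protein_hit: bool     # BLASTx hit against the Nr protein database
    has_rfam_hit: bool        # house-keeping RNA hit in Rfam
    is_mirna_precursor: bool  # BLASTn hit to a known miRNA precursor


def is_candidate_lncrna(t: Transcript) -> bool:
    """Apply the thresholds quoted in the text, in the order described."""
    return (
        t.length_nt >= 200
        and t.cpc_score < -1
        and t.lncfinder_score < 0.5
        and not t.has_protein_hit
        and t.longest_orf_aa < 102
        and not t.has_rfam_hit
        and not t.is_mirna_precursor
    )


def filter_lncrnas(transcripts: Iterable[Transcript]) -> List[Transcript]:
    """Return only the transcripts that pass every filter."""
    return [t for t in transcripts if is_candidate_lncrna(t)]
```

Keeping each criterion as a separate boolean field makes it easy to report how many transcripts are lost at each step, which is useful when comparing against the counts given in the Results.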
The remaining transcriptome sequences that were not classified as lncRNAs were used as queries
against the NCBI Nr protein database using BLASTx with an e-value cutoff of 1e-5. The sequences with
BLAST hits were then analysed to remove house-keeping RNAs. The final sequences were
treated as protein-coding sequences in this study for target gene analysis.
From the resulting lncRNA transcripts, the temperature responsive lncRNAs were filtered by two
parameters. The mango lncRNAs with the adjusted p-value of 0.05 and the log2 fold change of greater
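A two-parameter filter of this kind can be sketched as follows. This is a minimal illustration rather than the authors' script: the cutoff of 2 is taken from the threshold quoted in the Results (<-2 or >2), the function names are hypothetical, and the adjusted p-values in the usage note are invented for the example.

```python
def is_temperature_responsive(fold_change: float, padj: float,
                              fc_cutoff: float = 2.0,
                              alpha: float = 0.05) -> bool:
    """True when the change exceeds the cutoff in either direction
    and the FDR-adjusted p-value is below alpha."""
    return padj < alpha and abs(fold_change) > fc_cutoff


def direction(fold_change: float) -> str:
    """Label a responsive lncRNA as up- or down-regulated."""
    return "up" if fold_change > 0 else "down"
```

For example, a lncRNA with a fold change value of -6.22 and a hypothetical adjusted p-value of 0.01 passes the filter and is labelled "down", whereas one with a value of 1.5 is rejected regardless of its p-value.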
Mango mRNAs downloaded from the NCBI database and the mango protein-coding sequences identified
above were used for the target gene prediction of miRNAs. The putative target sites of miRNAs
were identified by aligning the miRNA sequences using the plant target prediction tool psRNATarget
(http://plantgrn.noble.org/psRNATarget/) [58]. To reduce the number of false predictions, the
maximum expectation threshold was set to 3.0. The length for
complementarity scoring (hsp size) was set to the length of the mature miRNAs. The maximum
energy of unpairing (UPE) the target site was set to 25 kcal. The flanking length around the target site for
target accessibility analysis was 17 bp upstream and 13 bp downstream. The range of central
mismatch leading to translation inhibition was set to 9-11 nt. No gaps and no more than four
mismatches between a miRNA and its target (a G-U pair counting as 0.5 mismatch) were allowed. The target
genes of mango lncRNAs were predicted using the LncTar tool [59] with the normalized binding free
To predict the lncRNAs as the target genes of miRNAs, psRNATarget was used as previously
mentioned in the interaction prediction of miRNAs and mRNAs. For target mimic prediction, TAPIR
server [60] was used in this study. TAPIR is a web server for the prediction of plant miRNA targets
The gene ontology (GO) analysis of the identified target transcripts was carried out by combining
BLASTx data and InterProScan analysis data in the BLAST2GO software [61]. The GO
enrichment analysis was performed using Fisher’s exact test with multiple testing correction by
false discovery rate (FDR). KEGG (Kyoto Encyclopedia of Genes and Genomes) pathway analysis was
also performed to better understand the functions of the target genes.
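To make the enrichment step concrete, the sketch below shows a one-sided Fisher's exact test (computed as a hypergeometric tail) followed by a Benjamini-Hochberg FDR adjustment, using only the Python standard library. It is an illustration under assumptions, not a description of BLAST2GO internals: the one-sided (over-representation) form and the Benjamini-Hochberg procedure are assumed here, and all function and parameter names are hypothetical.

```python
from math import comb
from typing import List, Sequence


def fisher_enrichment_p(k: int, n: int, K: int, N: int) -> float:
    """One-sided p-value P(X >= k) for a GO term.

    N: genes in the background, K: background genes carrying the term,
    n: genes in the test set, k: test-set genes carrying the term.
    """
    upper = min(n, K)
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


def benjamini_hochberg(pvalues: Sequence[float]) -> List[float]:
    """Return FDR-adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted
```

A term would then be reported as enriched when its adjusted p-value falls below the chosen FDR threshold.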
The conservation of mango lncRNAs was assessed using BLASTn against all lncRNA
sequences from the plant lncRNA database CANTATAdb [62], with an e-value cutoff of 1e-20.
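The e-value filter can be sketched as a small parser over BLAST tabular output. This assumes the search was saved in the standard 12-column tabular format (BLAST+ `-outfmt 6`), which the text does not state; the function name and the subject identifiers in the usage note are hypothetical.

```python
from typing import Dict, Set


def conserved_queries(blast_tab_text: str,
                      max_evalue: float = 1e-20) -> Dict[str, Set[str]]:
    """Map each query lncRNA to the subject lncRNAs it matches
    at or below the e-value cutoff."""
    hits: Dict[str, Set[str]] = {}
    for line in blast_tab_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        # Default column order: qseqid, sseqid, pident, length, mismatch,
        # gapopen, qstart, qend, sstart, send, evalue, bitscore.
        query, subject, evalue = fields[0], fields[1], float(fields[10])
        if evalue <= max_evalue:
            hits.setdefault(query, set()).add(subject)
    return hits
```

Given one row for a query with an e-value of 3e-90 and another row for a different query with an e-value of 2e-08, only the first query is returned as conserved.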
Finally, the interaction network of miRNAs, lncRNAs and their target genes was visualized using
Cytoscape [63].
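One simple way to hand such interaction pairs to Cytoscape is its Simple Interaction Format (SIF), in which each line holds a source node, an interaction type and a target node. The sketch below is an assumption about how the network could be prepared, not the authors' procedure; the function name and the interaction labels are hypothetical, and the node names are examples taken from the text.

```python
from typing import Iterable, Tuple

Edge = Tuple[str, str, str]  # (source, interaction type, target)


def to_sif(edges: Iterable[Edge]) -> str:
    """Serialize edges as tab-delimited SIF lines."""
    lines = [f"{src}\t{kind}\t{dst}" for src, kind, dst in edges]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    example_edges = [
        ("MmiR105772", "targets", "SPL6"),
        ("CRlnc26299", "binds", "RC12B"),
        ("CRlnc31221", "mimics_target_of", "MmiR5408"),
    ]
    with open("mango_ncrna_network.sif", "w") as handle:
        handle.write(to_sif(example_edges))
```

Using distinct interaction labels for miRNA-mRNA, miRNA-lncRNA, lncRNA-mRNA and mimic pairs lets the edge types be styled separately once the file is imported.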
Abbreviations
ABA
abscisic acid
ACO1
1-aminocyclopropane-1-carboxylate oxidase 1
ADH1
alcohol dehydrogenase 1
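AMFE
adjusted minimal free energy
BLAST
basic local alignment search tool
bp
base pair
C
cytosine
CNGC
cyclic nucleotide-gated channel
CPC
coding potential calculator
DCL1
dicer-like 1 enzyme
EST
expressed sequence tag
FDR
false discovery rate
G
guanine
GO
gene ontology
GSS
genome survey sequence
Hsp70
heat shock protein 70
KAT2
3-ketoacyl-CoA thiolase 2
kcal
kilocalorie
KEGG
Kyoto Encyclopedia of Genes and Genomes
lncRNA
long non-coding RNA
Lrr
leucine-rich repeat receptor
MFE
minimal free energy
MFEI
minimal free energy index
miRNA
microRNA
mRNA
messenger RNA
NCBI
National Center for Biotechnology Information
ncRNA
non-coding RNA
ndG
normalized binding free energy
Nr
non-redundant
nt
nucleotide
ORF
open reading frame
piRNA
piwi-interacting RNA
RC12B
related cDNA 12 B
RISC
RNA-induced silencing complex
RNA
ribonucleic acid
SCL14
scarecrow-like 14
siRNA
small interfering RNA
snoRNA
small nucleolar RNA
SPL
squamosa promoter binding protein-like
STP13
sugar transport protein 13
UPE
energy of unpairing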
Declarations
Ethics approval and consent to participate
Not applicable.
Not applicable.
The unigene sequence data are available from the mango RNA-Seq database
from the supplementary data of the published journals [40, 52-54]. All data generated and analysed
during this study are included in this published article (and its supplementary information files).
Competing interests
Funding
This research was supported by the Talented Young Scientist Program organized by the China Ministry
of Science and Technology. Ming Chen’s Lab is grateful for the support from MOST
Authors’ contributions
NMMM and MC designed the research. NMMM and PZ performed bioinformatics analysis.
NMMM and YC analyzed plant genes data. All authors approved the final manuscript.
Acknowledgements
The author thanks all lab members for their suggestions during this research work.
Author information
Affiliations
Department of Bioinformatics, Key State Laboratory of Plant Physiology and Biochemistry, College of
Life Sciences and Food College, Inner Mongolia University for the Nationalities, Tongliao, Inner
Mongolia, PR China
Authors’ information
Not applicable.
Corresponding authors
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and
institutional affiliations.
References
1. Kurihara Y, Watanabe Y. Arabidopsis micro-RNA biogenesis through Dicer-like 1
101(34):12753–12758.
and characterization of conserved miRNAs and their target genes in garlic (Allium
3. Bartel DP. MicroRNAs: genomics, biogenesis, mechanism, and function. Cell.
2004;116(2):281–97.
4. Zhang B, Pan X, Cobb GP, Anderson TA. Plant microRNA: a small regulatory molecule
5. Rhoades MW, Reinhart BJ, Lim LP, Burge CB, Bartel B, Bartel DP. Prediction of plant
712.
8. Zhang Y-C, Chen Y-Q. Long noncoding RNAs: new regulators in plant development.
9. Jalali S, Bhartiya D, Lalwani MK, Sivasubbu S, Scaria V. Systematic transcriptome
10. Salmena L, Poliseno L, Tay Y, Kats L, Pandolfi PP. A ceRNA hypothesis: the Rosetta
11. Yoon J-H, Abdelmohsen K, Gorospe M. Functional interactions among microRNAs and
long noncoding RNAs. In: Seminars in cell & developmental biology: 2014. Elsevier:
9–14.
12. Xin M, Wang Y, Yao Y, Song N, Hu Z, Qin D, Xie C, Peng H, Ni Z, Sun Q. Identification
mildew infection and heat stress by using microarray analysis and SBS sequencing.
13. Zhang J, Mujahid H, Hou Y, Nallamilli BR, Peng Z. Plant long ncRNAs: a new frontier
14. Megha S, Basu U, Rahman MH, Kav NN. The role of long non-coding RNAs in abiotic
16. Slippers B, Johnson GI, Crous PW, Coutinho TA, Wingfield BD, Wingfield MJ.
uniqueness over common cultivars from Florida, India, and Southeast Asia. Genome.
2010;53(4):321–30.
18. Galán Saúco V. Worldwide mango production and market: current situation and future
19. Siddiq M, Akhtar S, Siddiq R. Mango Processing, Products and Nutrition. Tropical and
20. Mukherjee S. The mango—Its botany, cultivation, uses and future improvement,
22. Dezulian T, Palatnik JF, Huson D, Weigel D. Conservation and divergence of microRNA
23. Weber MJ. New human and mouse microRNA genes found by homology search. FEBS J.
2005;272(1):59–73.
24. Zhang B, Pan X, Cannon CH, Cobb GP, Anderson TA. Conservation and divergence of
25. Zhang BH, Pan XP, Wang QL, George PC, Anderson TA. Identification and
2005;15(5):336–60.
26. Frazier TP, Zhang B. Identification of plant microRNAs using expressed sequence tag
27. Pan X, Zhang B, Francisco MS, Cobb GP. Characterizing viral microRNAs and its
2007;211(1):10–8.
28. Pérez-Quintero ÁL, Sablok G, Tatarinova TV, Conesa A, Kuo J, López C. Mining of
miRNAs and potential targets from gene oriented clusters of transcripts sequences of
characterization of miRNA from coconut leaf transcriptome. Journal of Applied
Horticulture. 2015;17(1):12–7.
30. Yao F, Zhu H, Yi C, Qu H, Jiang Y. MicroRNAs and targets in senescent litchi fruit
during ambient storage and post-cold storage shelf life. BMC plant biology.
2015;15(1):181.
gene regulation in black pepper (Piper nigrum L.) using high-throughput small RNA
32. Zhang B, Pan X, Cox S, Cobb G, Anderson T. Evidence that miRNAs are different from
33. Pandit SS, Kulkarni RS, Giri AP, Köllner TG, Degenhardt J, Gershenzon J, Gupta VS.
Expression profiling of various genes during the fruit development and ripening of
34. Gandikota M, Birkenbihl RP, Höhmann S, Cardon GH, Saedler H, Huijser P. The
miRNA156/157 recognition element in the 3′ UTR of the Arabidopsis SBP box gene
2007;49(4):683–93.
35. Wu G, Park MY, Conway SR, Wang J-W, Weigel D, Poethig RS. The sequential action of
2009;138(4):750–9.
37. Chuck G, Cigan AM, Saeteurn K, Hake S. The heterochronic maize mutant Corngrass1
38. Jiao Y, Wang Y, Xue D, Wang J, Yan M, Liu G, Dong G, Zeng D, Lu Z, Zhu X. Regulation
2010;42(6):541.
39. Jeong D-H, Park S, Zhai J, Gurazada SGR, De Paoli E, Meyers BC, Green PJ. Massive
dynamics in mango fruit peel reveals mechanisms of chilling stress. Frontiers in plant
science. 2016;7:1579.
during cold storage and chilling injury development in ‘Keitt’ mango fruit. Postharvest
43. Li Y-H, Zou M-H, Feng B-H, Huang X, Zhang Z, Sun G-M. Molecular cloning and
characterization of the genes encoding an auxin efflux carrier and the auxin influx
carriers associated with the adventitious root formation in mango (Mangifera indica
44. Li Y-H, Zhang H-N, Wu Q-S, Muday GK. Transcriptional sequencing and analysis of
45. Medina Jn, Catalá R, Salinas J. Developmental and stress regulation of RCI2A
comprehensive transcriptional profiling of the WRKY gene family in rice under various
48. Jiang T, Zhang X-F, Wang X-F, Zhang D-P. Arabidopsis 3-ketoacyl-CoA thiolase-2
49. Singh RK, Sane VA, Misra A, Ali SA, Nath P. Differential expression of the mango
2010;71(13):1485–94.
50. Wu H-J, Wang Z-M, Wang M, Wang X-J. Widespread long noncoding RNAs as
2013;161(4):1875–84.
51. Franco-Zorrilla JM, Valli A, Todesco M, Mateos I, Puga MI, Rubio-Somoza I, Leyva A,
Weigel D, García JA, Paz-Ares J. Target mimicry provides a new mechanism for
52. Wu H-x, Jia H-m, Ma X-w, Wang S-b, Yao Q-s, Xu W-t, Zhou Y-g, Gao Z-s, Zhan R-l.
53. Luria N, Sela N, Yaari M, Feygenberg O, Kobiler I, Lers A, Prusky D. De-novo assembly
mango (Mangifera indica L.) fruit epidermal peel to identify putative cuticle-
associated genes. Scientific reports. 2017;7:46163.
55. Zuker M. Mfold web server for nucleic acid folding and hybridization prediction.
56. Kong L, Zhang Y, Ye Z-Q, Liu X-Q, Zhao S-Q, Wei L, Gao G. CPC: assess the protein-
coding potential of transcripts using sequence features and support vector machine.
2019;20(6):2009–27.
58. Dai X, Zhao PX. psRNATarget: a plant small RNA target analysis server. Nucleic acids
research. 2011;39(suppl_2):W155–9.
59. Li J, Ma W, Zeng P, Wang J, Geng B, Yang J, Cui Q. LncTar: a tool for predicting the
60. Bonnet E, He Y, Billiau K, Van de Peer Y. TAPIR, a web server for the prediction of
8.
61. Conesa A, Götz S, García-Gómez JM, Terol J, Talón M, Robles M. Blast2GO: a universal
Bioinformatics. 2005;21(18):3674–6.
63. Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, Amin N, Schwikowski
Supplementary Materials
Supplementary Table 1 miRNAs data
Figures
Figure 1
Workflow for identification of ncRNAs (miRNAs and lncRNAs) and their targets. The yellow
rounded rectangle represents the data input and the green rounded rectangles represent the data
output. The blue rectangles are the processing steps and the grey cylinders represent databases.
Figure 2
Mango miRNAs. (A) Length distribution of miRNAs; (B) Hairpin structures of five precursor
miRNAs (mature miRNAs are highlighted in yellow) involved in the development
process of mango; (C) GO enrichment analysis of highly enriched target genes of miRNAs.
Figure 3
Temperature responsive lncRNAs of mango. (A) Volcano plot of cold responsive lncRNAs of
mango; green points represent significantly expressed cold responsive lncRNAs with a false
discovery rate (adjusted p-value) of less than 0.05 and a log2 fold change of less than -2 or
greater than 2; (B) Conserved number of mango lncRNAs in different plant species; (C)
Interaction subnetwork between three low temperature responsive proteins and cold
targeted by heat responsive lncRNAs; (E) GO enrichment analysis of highly enriched target
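The significance thresholds stated for panel (A) of Figure 3 can be expressed as a small Python check. This is a sketch of the stated criteria only (adjusted p-value below 0.05 and log2 fold change below -2 or above 2); the function name and the example values are hypothetical.

```python
FDR_THRESHOLD = 0.05    # adjusted p-value cutoff from the caption
LOG2FC_THRESHOLD = 2.0  # absolute log2 fold change cutoff from the caption


def is_significant(log2_fold_change, fdr):
    """Return True when a lncRNA meets both volcano-plot thresholds."""
    strong_change = (
        log2_fold_change < -LOG2FC_THRESHOLD or log2_fold_change > LOG2FC_THRESHOLD
    )
    return fdr < FDR_THRESHOLD and strong_change


if __name__ == "__main__":
    # Hypothetical (log2 fold change, FDR) pairs.
    for lfc, fdr in [(3.1, 0.001), (-2.5, 0.2), (1.4, 0.01)]:
        print(lfc, fdr, is_significant(lfc, fdr))
```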
Figure 4
Interaction network among mango ncRNAs and their targets. Green triangles represent cold
responsive lncRNAs, red triangles represent heat responsive lncRNAs, yellow ellipses are
used for target genes, blue rectangles are for miRNAs and purple V-shapes for target
mimics. (A) Network of interaction among newly identified miRNAs, newly identified lncRNAs
(cold responsive lncRNAs and heat responsive lncRNAs), newly identified target mimic of
miRNAs and mRNAs; (B) subnetwork of interaction among MmiR5408, its target mimic
CRlnc31221, target CRlncRNAs and 47 target genes; (C) Base-pairing interaction between
Supplementary Files
This is a list of supplementary files associated with this preprint.
SupplementaryDataSet1.txt
SupplementaryDataSet2.txt
SupplementaryTable4.xlsx
SupplementaryTable5.xlsx
SupplementaryTable6.xlsx
SupplementaryTable7.xlsx
SupplementaryTable8.xlsx
SupplementaryTable9.xlsx
SupplementaryTable2.xlsx
SupplementaryTable3.xlsx
SupplementaryTable1.xlsx