An In-Silico Approach to Identify Ripening Related β-1,3-Glucanases and Their Role in Tomato Fruit †

Tomato, Solanum lycopersicum, is one of the most cultivated fruits. However, between one quarter and one half of production is lost due to uncontrolled conditions during transport and storage. Understanding the biological components and processes that affect the physical properties and structure of tomato fruits during the post-harvest period is essential to design new strategies to reduce these losses. The aim of this work is to identify specific cell wall β-1,3-glucanases expressed during tomato fruit ripening using a combination of phylogeny and transcriptomic analysis. Fifty β-1,3-glucanases (BG) were identified in-silico using bioinformatic tools. Phylogenetic analysis revealed that tomato genes are distributed in three clusters (α, β and γ), matching the evolutionary relationships previously characterized in the model plant Arabidopsis thaliana. In Arabidopsis, cluster α comprises all enzymes identified to target the intercellular channels named plasmodesmata, whereas pathogenesis and stress responsive apoplastic enzymes are grouped in clusters β and γ. Analysis of tomato microarray data showed two types of regulation for BGs expressed in tomato fruit: enzymes in cluster α decreased their expression during ripening, while enzymes in clusters β and γ increased their expression in mature fruits. The results suggest differential regulation of enzymes in different phylogenetic clusters, which points to evolutionary divergence in function. The analysis also provides new candidates for cell wall modification that can be further studied as possible targets in fruit improvement programmes.


Introduction
According to the Food and Agriculture Organization (FAO), the tomato fruit, Solanum lycopersicum, is the second most cultivated fruit worldwide, with 243 million tonnes of fruit produced in 2018 [1]. Exports of tomato fruits from EU countries accounted for 17.7% of all vegetable outputs in the period 2015-2017 [2]. In spite of this, losses in production are considerable and are estimated to range between 24 and 50% in a single year [3]. Poor conditions during transport and storage from field to market are the main causes of these losses, which are influenced by the lack of temperature control, physical injury, pests and disease [3]. Understanding the biological components and processes that affect the shelf-life of the tomato fruit is essential to design new strategies to reduce these losses.
One of the main processes that influences the shelf-life of tomato fruit is the rapid softening during ripening. Softening is a biological process that involves the disassembly of the cell wall (CW), cell turgor, fruit water status, hydrostatic pressure and the accumulation and distribution of osmotically active solutes [4]. Delaying this process is one of the major targets in fruit breeding programmes. Research in this area focuses on introducing changes in the expression of specific cell wall modifying enzymes (CWME) that target major cell wall components (such as pectins and cellulose) [5]. However, the role of minor components, such as callose (β-1,3-glucan), in tomato ripening and post-harvest shelf-life remains unclear. Callose is a plant cell wall polysaccharide synthesised by callose synthases (Cals) and hydrolysed by β-1,3-glucanases (BG) [6]. BGs are a group of proteins belonging to the glycosyl hydrolase family 17 (GH17) that perform a wide variety of functions in plants [7]. Callose synthesis and degradation have been implicated in disease response, in the control of intercellular transport and in the regulation of cell wall integrity and mechanical properties [8].
The objective of this study is to identify, via an in-silico approach, candidate BG enzymes participating in tomato ripening. To identify the candidate BG genes, this study analysed predicted β-1,3-glucanase structural features (presence of protein sequence domains and catalytic residues); carried out a phylogenetic comparison with known and predicted Arabidopsis thaliana BG genes; analysed the expression of BG genes in tomato using publicly available transcriptomic data; and used bioinformatic predictions of cellular and tissue localisation.

Isolation of β-1,3-Glucanase Gene Sequences and Prediction of Domains and Catalytic Residues
Glycoside hydrolase family 17 (GH17) proteins of Arabidopsis thaliana and Solanum lycopersicum were downloaded from Phytozome v9.0 under the search term 'glyco_hydro_17' to identify genes likely to contain GH17 domains. The presence of a GH17 domain in these sequences was confirmed using the SMART [9], Conserved Domain [10] and InterPro [11] prediction tools. To identify and remove redundant sequences, the obtained sequences were aligned in the UGENE [12] bioinformatics software using the MUSCLE [13] alignment function.
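The redundancy check can be illustrated with a minimal Python sketch. It is not the UGENE/MUSCLE workflow used in this study: it assumes the protein sequences are available as FASTA text and flags only sequences that are identical character for character, whereas an alignment can also reveal near-identical entries. The function names and the toy sequences are hypothetical and serve only to show the idea.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into an {identifier: sequence} mapping."""
    records = {}
    current_id = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first word of the header line as the identifier.
            current_id = line[1:].split()[0]
            records[current_id] = []
        elif current_id is not None:
            records[current_id].append(line)
    return {seq_id: "".join(parts).upper() for seq_id, parts in records.items()}


def remove_redundant(records):
    """Keep the first identifier seen for each distinct amino acid sequence.

    Returns (kept, dropped), where dropped maps each removed identifier to
    the retained identifier that carries the same sequence.
    """
    first_seen = {}
    kept = {}
    dropped = {}
    for seq_id, seq in records.items():
        # Ignore a trailing stop marker so 'MKT*' and 'MKT' compare equal.
        key = seq.rstrip("*")
        if key in first_seen:
            dropped[seq_id] = first_seen[key]
        else:
            first_seen[key] = seq_id
            kept[seq_id] = seq
    return kept, dropped


# Toy sequences (NOT real GH17 proteins), used only to show the behaviour.
example = """>geneA
MKTLLAV
>geneB
MKTLLAV
>geneC
MKSLLAV
"""
kept, dropped = remove_redundant(parse_fasta(example))
print(list(kept))  # ['geneA', 'geneC']
print(dropped)     # {'geneB': 'geneA'}
```

Keeping the first identifier encountered is an arbitrary choice in this sketch; which member of a redundant pair is retained does not affect the downstream phylogeny, since the sequences are identical.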

Comparative Phylogenetic Analysis of Tomato and Arabidopsis glucanases
Arabidopsis thaliana GH17 genes identified in Gaudioso-Pedraza and Benitez-Alfonso (2014) [22] were aligned with the Solanum lycopersicum genes in UGENE [12] using the MUSCLE [13] alignment tool. Phylogenetic analysis was carried out in MEGA-X 10.1.8 using the Jones-Taylor-Thornton (JTT) model. Conditions and restrictions were varied to produce a range of dendrograms with different bootstrap settings (between 100 and 1000 replicates), branch swap filter settings (none or very weak) and ML heuristic method settings (Nearest-Neighbour-Interchange (NNI) or Subtree-Pruning-Regrafting, extensive (SPR level 5)). Additionally, the analysis was repeated with the same settings for the full sequences and for the domain-only sequences. Arabidopsis thaliana GH17 genes with common names were annotated, and homologous Solanum lycopersicum genes that showed consistently close relations with the Arabidopsis thaliana GH17 genes across the different dendrograms were predicted to exhibit similar properties. Confidence in the dendrograms was assessed from the clustering of the known Arabidopsis thaliana genes, based on Doxey et al. (2007) [23].
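The idea of retaining only relations that are consistent across dendrograms can be sketched as a set intersection. The Python example below is illustrative only and is not part of the MEGA-X analysis: it assumes that, for each dendrogram, the tomato genes in the closest clade of each annotated Arabidopsis gene have already been read off the tree by hand or with a tree-parsing tool. The function name and the gene labels are hypothetical.

```python
def consistent_relations(neighbours_per_tree):
    """Return Arabidopsis -> tomato pairings that occur in every dendrogram.

    neighbours_per_tree: list of dicts, one per dendrogram, each mapping an
    Arabidopsis gene name to the set of tomato genes in its closest clade.
    """
    if not neighbours_per_tree:
        return {}
    # Only Arabidopsis genes present in every dendrogram can be compared.
    shared_genes = set(neighbours_per_tree[0])
    for tree in neighbours_per_tree[1:]:
        shared_genes &= set(tree)
    consistent = {}
    for gene in sorted(shared_genes):
        # Tomato genes that sit next to this Arabidopsis gene in all trees.
        common = set.intersection(
            *(set(tree[gene]) for tree in neighbours_per_tree)
        )
        if common:
            consistent[gene] = sorted(common)
    return consistent


# Hypothetical labels standing in for three dendrograms built with
# different settings (for example NNI vs SPR, full vs domain-only).
trees = [
    {"AtGeneA": {"tomato1", "tomato2"}, "AtGeneB": {"tomato3"}},
    {"AtGeneA": {"tomato1"}, "AtGeneB": {"tomato4"}},
    {"AtGeneA": {"tomato1", "tomato5"}, "AtGeneB": {"tomato3"}},
]
print(consistent_relations(trees))  # {'AtGeneA': ['tomato1']}
```

Requiring a pairing to appear in every dendrogram is the strictest possible rule; a majority rule would be a reasonable alternative, but the criterion actually applied is not specified beyond "consistent close relations".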

Gene Expression Profile Analysis
Microarrays for Solanum lycopersicum fruit development were downloaded from the ArrayExpress [24] database. Arrays were selected and normalised using an in-house R script, developed by Sam Amsbury and Philip Kirk, to look for differentially expressed genes in tomato fruits. Pericarp expression and fruit development data were downloaded directly from the SGN Tomato Expression Atlas (TEA) [25] and TomExpress [26]. Expression trends were analysed and compared between TEA, TomExpress and Genevestigator [27] data to validate the results. The intracellular localisation of the gene products was predicted using the PSORT [28] and WoLF PSORT [29] prediction tools and analysed to identify cell wall proteins. Tissue expression data were downloaded from Genevestigator [27] to distinguish fruit-specific from non-specific genes, and were compared with the localisation predicted using the ePlant BAR [30] database.
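The comparison of expression trends across ripening stages can be sketched in Python as follows. This is an illustration under stated assumptions, not the in-house R script or the criteria used in this study: the log2 fold-change threshold, the pseudocount and the early-versus-late split are all hypothetical choices, and the expression values are invented.

```python
import math


def classify_trend(values, min_log2_fc=1.0, pseudocount=1.0):
    """Classify an expression time course as increasing or decreasing.

    values: expression levels ordered from the earliest to the latest
    developmental stage. The mean of the first half is compared with the
    mean of the second half (the middle point is skipped for odd lengths).
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two stages to define a trend")
    half = n // 2
    early = sum(values[:half]) / half
    late = sum(values[-half:]) / half
    # The pseudocount avoids division by zero for unexpressed genes.
    log2_fc = math.log2((late + pseudocount) / (early + pseudocount))
    if log2_fc >= min_log2_fc:
        return "increasing"
    if log2_fc <= -min_log2_fc:
        return "decreasing"
    return "no clear trend"


# Invented values for three hypothetical genes across four stages.
profiles = {
    "gene_down": [40, 30, 10, 5],
    "gene_up": [2, 4, 20, 60],
    "gene_flat": [10, 11, 9, 10],
}
for gene, values in profiles.items():
    print(gene, classify_trend(values))
# gene_down decreasing
# gene_up increasing
# gene_flat no clear trend
```

Applying the same classifier to each data source and keeping only genes that receive the same label in all of them is one simple way to express the cross-database validation described above.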

Phylogeny Finds Tomato Genes Closely Related to Known Arabidopsis BG Genes
To identify tomato BGs likely to share functions with known Arabidopsis BG genes, a phylogenetic tree was created (Figure 1). GH17-containing BGs in tomato were identified using the Phytozome v12.1 database. A total of 51 genes were identified. These genes were aligned using the UGENE [12] software to identify redundant sequences. Two genes, Solyc01g059980 and Solyc01g060020, were found to encode the same amino acid sequence; Solyc01g060020 was therefore removed from the analysis.
The phylogenetic tree (Figure 1) showed that the BG genes from tomato fall into three clusters, α, β and γ, as has been shown for Arabidopsis in Doxey et al. (2007) [23]. In Arabidopsis, cluster α comprises all the proteins previously identified to localize at the cell wall microdomains named plasmodesmata (PDBG), while cluster γ includes pathogenesis-induced proteins (BG1-BG5). This suggests that tomato genes residing in the α cluster may function at plasmodesmata, thus controlling cell-to-cell signalling, while genes included in cluster γ could be related to pathogenesis. Phylogenetic analysis showed that plasmodesmal-localised BG 1 (PDBG1), PDBG2 and PDBG3 are closely related to the Solyc01g005830, Solyc04g007910 and Solyc08005000 tomato genes, with bootstrap values between 70 and 90. The pathogenesis-related (PR) β-glucanase 1 (BG1), BG2 and BG3 genes were shown to be closely related to the tomato genes Solyc10g079860 and Solyc04g016470 (bootstrap value of 91).
Additionally, the 50 identified tomato protein sequences were screened for the common BG domains (signal peptide (SP), the callose-binding X8 domain and the glycosylphosphatidylinositol (GPI) anchor), as described in Materials and Methods, to predict the structure and localisation of the proteins (Figure 1). Based on the data from the prediction tools, all identified genes contained an SP, with the exception of the γ cluster genes Solyc11g065300, Solyc11g065280, Solyc01g060010 and Solyc00g202560, and the α cluster gene Solyc06g073710. Research has identified two conserved glutamate (E) residues in GH17 proteins that characterize the catalytic domain of glycosyl hydrolases. These glutamate residues were identified within the tomato sequences by comparison with the catalytic residue positions in Arabidopsis identified by Gaudioso-Pedraza and Benitez-Alfonso (2014) [22]. Some of the genes lack one or both of the catalytic residues required to hydrolyse callose. Of the 50 tomato sequences, 36 contained both residues, 13 contained only one of the two, and 1 contained neither.
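Scoring the catalytic glutamates by comparison with a reference can be sketched in Python as follows. The example assumes that the tomato sequences and an Arabidopsis reference are rows of the same multiple alignment (gaps written as '-') and that the positions of the two glutamates in the ungapped reference are known. The function names, the toy alignment and the positions used are hypothetical and do not correspond to real GH17 coordinates.

```python
from collections import Counter


def reference_columns(aligned_reference, residue_positions):
    """Map 1-based residue positions in the ungapped reference sequence
    to 0-based column indices in the alignment."""
    wanted = set(residue_positions)
    columns = []
    residue_index = 0
    for column, char in enumerate(aligned_reference):
        if char == "-":
            continue  # gaps do not count as residues
        residue_index += 1
        if residue_index in wanted:
            columns.append(column)
    if len(columns) != len(wanted):
        raise ValueError("a residue position lies outside the reference")
    return columns


def count_catalytic_glutamates(aligned_seq, columns):
    """Count how many of the given alignment columns hold a glutamate (E)."""
    return sum(
        1
        for col in columns
        if col < len(aligned_seq) and aligned_seq[col].upper() == "E"
    )


# Toy alignment (NOT real sequences); the reference has E at residues 3 and 6.
reference = "MK-ELLE-A"
alignment = {
    "seq_both": "MKTELLE-A",
    "seq_one": "MK-QLLEGA",
    "seq_none": "MK-QLLD-A",
}
cols = reference_columns(reference, [3, 6])
counts = {
    name: count_catalytic_glutamates(seq, cols)
    for name, seq in alignment.items()
}
print(cols)    # [3, 6]
print(counts)  # {'seq_both': 2, 'seq_one': 1, 'seq_none': 0}
print(sorted(Counter(counts.values()).items()))  # [(0, 1), (1, 1), (2, 1)]
```

The final tally groups sequences by the number of catalytic glutamates present (0, 1 or 2), which is the same kind of summary as the 36, 13 and 1 counts reported for the 50 tomato sequences.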

Figure 1. Phylogenetic relationships of β-1,3-glucanase genes in tomato and Arabidopsis.
Amino acid sequences of predicted β-1,3-glucanase genes were aligned and a phylogenetic tree was created using MEGA-X (using the WAG model and a bootstrap value of 1000). The tree is separated into clades (α, β and γ) as defined in Doxey et al. (2007) [23] for Arabidopsis. Predicted domains for each gene are indicated by the coloured symbols: red/triangle = signal peptide, green/circle = GPI-anchor, blue/square = X8 domain, 'x' = no domains predicted. The number of catalytic residues present (0, 1 or 2) is shown next to the coloured symbols.

Identified Tomato BG Genes Show Varying Levels of Expression across Different Tissues and Organs
Tissue expression data for the 50 identified tomato BG genes were extracted from the Genevestigator [27] database. The results were verified by comparison with expression data in the ePlant BAR [30] database. The expression of these enzymes varies across tissues and organs (data not shown). Based on the expression data from Genevestigator [27], one gene, Solyc04g016470, is expressed specifically in tomato fruit. Seven genes (Solyc05g015160, Solyc05g015170, Solyc04g011720, Solyc04g011730, Solyc03g058450, Solyc04g051590 and Solyc05g054440) show similar tissue expression patterns, with expression in flower and reproductive tissues and low expression elsewhere in the plant. In contrast, expression in the flower is uncommon for the other genes.

BG Expression Changes When Comparing Post-Anthesis and Ripened Fruits
Microarray data of fruit pericarp genes was downloaded from the SGN Tomato Expression Atlas (TEA) [25]. Expression values and trends were analysed and, of the 50 tomato genes, 16 were identified as displaying significant expression in developing fruit from anthesis to the red ripe stage (Figure 2). The data showed different patterns of expression: some genes increased in expression during fruit ripening and others decreased. The expression of these genes correlated well with their evolutionary relations. Expression of enzymes grouped in cluster α decreased during ripening, while expression of enzymes in clusters β and γ increased. Interpretation of this result in light of the activities identified in Arabidopsis suggests that BGs acting at plasmodesmata are required to improve cellular communication in early developing fruits, whereas defense enzymes are required later in mature fruits susceptible to pathogen attack. The expression trends were supported by comparison with TomExpress [26] and Genevestigator [27] expression data.

Figure 2. Expression levels (in RPM) of different β-1,3-glucanases during ripening, from fruit anthesis to red ripe (DPA = days post anthesis), based on data extracted from the SGN-TEA database [25]. Left panel shows genes with decreased expression during pericarp ripening, belonging to cluster α. Right panel shows the genes with increased expression during pericarp ripening.

Conclusions
The aim of this publication was to explore the family of β-1,3-glucanase genes in tomato fruit using a combination of phylogeny and transcriptomic analysis, in order to propose new targets to modify cell walls (and tomato softening) for fruit breeding programmes. Fifty callose-degrading enzymes (β-1,3-glucanases) were identified in-silico using bioinformatic tools. Phylogenetic analysis revealed that tomato genes are distributed in three clusters (α, β and γ). Expression of enzymes in cluster α (which comprises all plasmodesmata-localized proteins previously identified in Arabidopsis) decreases during ripening, while expression of enzymes in clusters β and γ (including Arabidopsis pathogenesis-induced proteins) increases. This result allows the identification of specific β-1,3-glucanases in tomato fruit that may influence fruit ripening (Figure 2) and that could be used as targets to delay softening and thus improve tomato shelf life.
Author Contributions: C.P. and Y.B.-A. conceived and designed the experiments and the project; L.P. performed the experiments; L.P., C.P. and Y.B.-A. analyzed the data; L.P., C.P. and Y.B.-A. wrote the paper.