Discussion
The proposed algorithm, applied to the CCR Nutrition database from a multicentre case–control study of 1496 pairs of Moroccan subjects with and without CRC, identified two dietary profiles associated with CRC: the ‘dangerous pattern’ and the ‘prudent pattern’. The ‘dangerous pattern’ was characterised by a high consumption of vegetable oil, cakes, chocolate, cheese, red meat, sugar and butter, whereas the ‘prudent pattern’ was characterised by a moderate consumption of almost all foods with a slight increase in fruits and vegetables. The frequency of cases was higher in the ‘dangerous’ group than in the ‘prudent’ group.
This study proposes a new methodological approach that combines two unsupervised machine-learning techniques, PCA and K-means, with K-means applied in the PCA subspace. Several studies have shown the advantages of this approach.8 18 28 Indeed, the continuous solution of the cluster indicators is given by the principal components of the PCA, and the optimal solution of the K-means clustering lies in the PCA subspace. Moreover, clustering in the reduced space performs better, at lower computational cost and with less noise. A recent review of statistical methods for dietary pattern analysis reported the advantages and disadvantages of PCA and the K-means clustering algorithm. Compared with traditional statistical methods, classification via machine-learning techniques reduces the misclassification rate, increases generalisability, allows grading of movement quality and simplifies experimental design.
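The idea of running K-means in the PCA subspace can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic stand-ins for a food-frequency table, the helper names `pca_project` and `kmeans` are hypothetical, and the choices of two components and two clusters are assumptions made only for the example. In practice, a library such as scikit-learn would normally be used for both steps.

```python
import numpy as np

def pca_project(X, n_components):
    """Standardise each column, then project onto the leading principal axes."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    # Rows of Vt are the principal axes, ordered by decreasing explained variance.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    return Z @ Vt[:n_components].T

def nearest_centroid(points, centroids):
    """Index of the closest centroid (squared Euclidean distance) for each point."""
    d2 = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def kmeans(points, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm; returns (labels, centroids)."""
    rng = np.random.default_rng(seed)
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        labels = nearest_centroid(points, centroids)
        # Recompute each centroid; keep the old one if a cluster is empty.
        new_centroids = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return nearest_centroid(points, centroids), centroids

# Synthetic data: 200 subjects x 6 food groups (illustrative only, NOT study data).
rng = np.random.default_rng(42)
low = rng.normal(loc=2.0, scale=0.5, size=(100, 6))
high = rng.normal(loc=2.0, scale=0.5, size=(100, 6))
high[:, :3] += 3.0  # second group consumes the first three food groups more often
X = np.vstack([low, high])

scores = pca_project(X, n_components=2)   # subjects in the PCA subspace
labels, centroids = kmeans(scores, k=2)   # clusters found in that subspace
print("cluster sizes:", np.bincount(labels, minlength=2))
```

Clustering the two-column `scores` matrix rather than the six-column `X` is what "K-means in the PCA subspace" means here: the distance computations run in fewer dimensions, and directions dominated by noise are discarded before clustering.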
Other strengths of our research should be mentioned. First, according to our literature search, it is the first study to cluster dietary profiles related to CRC in Morocco using an unsupervised machine-learning approach. Second, our case–control study included recently diagnosed CRC cases to avoid the effect of diet changes. In addition, trained interviewers ensured that the FFQs were completed, in order to maintain the objectivity of the responses.15
Two limitations of our study must be highlighted. First, our clustering was based on food groups that contain both foods known to be protective against CRC and foods known to be risk factors. Clustering on such groups may neutralise their effects and make discrimination difficult. Second, food consumption was measured as frequencies without considering daily quantities, which can influence the clustering.
A recent study using the Global Dietary Database (Canada, India, Italy, South Korea, Mexico, Sweden and the USA) found that CRC could be predicted from a list of important dietary features using supervised and unsupervised machine-learning approaches. That study identified total fat, monounsaturated fats, linoleic acid, cholesterol and omega-6 as dietary features moderately to highly correlated with CRC, and fibre and carbohydrates as negatively correlated with CRC. A systematic review of 17 years of evidence (2000–2016) revealed two distinct global dietary patterns related to CRC risk: a ‘healthy’ pattern, characterised by high intake of fruits and vegetables and higher intakes of one or more of whole grains, nuts and legumes, fish and other seafood, and milk and other dairy products; and an ‘unhealthy’ pattern, characterised by high intakes of red and processed meat, sugar-sweetened beverages, refined grains, desserts and potatoes.
Several studies in American, European and Asian populations have found three dietary patterns related to CRC.9 11 13 14 30 These are the ‘Western or meat-based diet’, which is associated with a higher risk of CRC; the ‘healthy, conservative or prudent’ pattern, which is associated with a lower risk of CRC; and the ‘low milk and dietary fibre intake or traditional’ pattern, which is associated with a relatively higher risk of CRC. We could not obtain very clearly separated groups, owing to the diverse nature of the nutrition landscape in the Moroccan population, although intakes of some harmful foods (meat, sugar and chocolate) were higher in cases than in controls. The difference in poultry consumption between the two clusters was non-significant, as similarly reported in a previous study.31
The perspectives of this work are as follows. First, to repeat the clustering process with single foods, in order to overcome the limitation of grouping protective and risk foods together, which neutralises their effects. Second, to develop a simple, user-friendly web application that allows lay users to identify their own dietary pattern and evaluate whether they are following a healthy diet, which is the best approach to personal prevention as recommended by the latest WHO guidelines.32
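As a rough sketch of how such an application might place a user in a dietary pattern, one simple option is to assign the user's profile to the nearest cluster centroid. Everything below is an assumption for illustration: the food-group names, the centroid values and the function `assign_pattern` are hypothetical and are not results or methods from this study.

```python
from math import dist

# Hypothetical food groups and centroids in standardised units.
# These values are invented for illustration and are NOT study results.
FOOD_GROUPS = ("red_meat", "sugar", "fruit", "vegetables")
CENTROIDS = {
    "dangerous": (1.0, 1.0, -0.5, -0.5),
    "prudent": (-0.3, -0.3, 0.4, 0.4),
}

def assign_pattern(profile):
    """Return the name of the centroid closest (Euclidean) to a standardised profile."""
    if len(profile) != len(FOOD_GROUPS):
        raise ValueError(f"expected {len(FOOD_GROUPS)} values, got {len(profile)}")
    return min(CENTROIDS, key=lambda name: dist(profile, CENTROIDS[name]))

# Example: a user whose standardised intake is high in red meat and sugar.
user_profile = (1.2, 0.8, -0.4, -0.6)
print(assign_pattern(user_profile))
```

A real application would need to standardise the user's answers with the same means and standard deviations as the study data, and to use centroids estimated from that data, before an assignment of this kind could be interpreted.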