ANALYSIS AND FORECASTING OF THE SPREAD OF ONCOLOGICAL DISEASES IN THE KHABAROVSK TERRITORY
Daria Bondar
student, Pacific National University,
Russia, Khabarovsk
Elena Agapova
docent, Pacific National University,
Russia, Khabarovsk
Anna Ostapenko
docent, Pacific National University,
Russia, Khabarovsk
Statistical data processing methods are needed to study the dynamics of every area of human activity, including medicine. Analysis of data on cancer patients makes it possible to identify the most widespread diagnoses and the age groups affected most often, which can in turn inform treatment planning. In addition, data collected over a given period can be used to forecast how the disease will spread by means of neural networks.
The purpose of the work is a statistical study of data on cancer patients provided by the Regional Clinical Center of Oncology in Khabarovsk, together with forecasting of the disease dynamics using time series and neural networks.
To conduct the research, a data registry containing information on cancer patients in the Khabarovsk Territory was obtained.
The research showed that women are more susceptible to cancer, which may be due to the anatomical and physiological characteristics of the female body.
The study identified the areas of the Khabarovsk Territory in which cancer is more common than elsewhere: Khabarovsk, Komsomolsk, Khabarovsk District, Amur District, and Lazo District. The data obtained are shown in Figure 1.
Figure 1. Number of cancer patients by district
While studying the age factor of cancer patients, the groups shown in Figure 2 were identified.
Figure 2. Number of cancer patients by age
The analysis of age groups showed that the most common age range among cancer patients is 56-80 years, which accounts for 64% of all patients.
According to the data obtained, the most widespread diagnoses in the Khabarovsk Territory are malignant diseases of the reproductive organs (20%), breast cancer (19%), and skin neoplasms (13%).
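The share of an age group can be computed by binning the patients' ages and dividing each group's count by the total. The sketch below is only an illustration: the bin boundaries other than 56-80 and the sample ages are assumptions made for the example and are not taken from the registry.

```python
from collections import Counter

# Hypothetical age bins (inclusive bounds); only the 56-80 group is named in the text.
AGE_BINS = [(0, 30), (31, 55), (56, 80), (81, 120)]


def age_group(age):
    """Return the label of the bin that contains the given age."""
    for low, high in AGE_BINS:
        if low <= age <= high:
            return f"{low}-{high}"
    raise ValueError(f"age out of range: {age}")


def group_shares(ages):
    """Map each age-group label to its percentage of all patients."""
    counts = Counter(age_group(a) for a in ages)
    total = sum(counts.values())
    return {label: 100.0 * n / total for label, n in counts.items()}


# Made-up ages for illustration only; these are not registry data.
sample_ages = [25, 47, 58, 61, 66, 70, 73, 79, 84, 52]
shares = group_shares(sample_ages)
print(shares)
```

The same counting approach would apply to the breakdown by district or by diagnosis, with the binning function replaced by the corresponding category field.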
The initial data contained information on when the patient was registered and diagnosed.
Let us consider a time series containing the number of cancer patients with reproductive organ disease by year.
Let us build a line graph showing the dependence of the number of cancer patients on the year of diagnosis (Figure 3).
Neural network models built on time series tend to be among the most accurate, so they are often used for forecasting.
Figure 3. Dependence of the number of cancer patients on the year of diagnosis
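A common way to feed a time series to a neural network is to turn it into sliding windows, in which several consecutive values serve as inputs and the next value is the target. The sketch below shows this step only; the function name `make_windows`, the window length, and the yearly counts are assumptions made for illustration and do not come from the study.

```python
def make_windows(series, n_lags):
    """Split a series into (inputs, target) pairs using a sliding window.

    Each input holds n_lags consecutive values; the target is the value
    that immediately follows them.
    """
    if n_lags < 1 or n_lags >= len(series):
        raise ValueError("n_lags must be between 1 and len(series) - 1")
    inputs, targets = [], []
    for start in range(len(series) - n_lags):
        inputs.append(series[start:start + n_lags])
        targets.append(series[start + n_lags])
    return inputs, targets


# Made-up yearly counts for illustration only; these are not registry data.
yearly_counts = [10, 12, 15, 14, 18, 21]
X, y = make_windows(yearly_counts, n_lags=3)
for window, target in zip(X, y):
    print(window, "->", target)
```

A forecast for the year after the last observation would then be obtained by passing the final `n_lags` values of the series to the trained model.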
The quality of neural networks trained on a time series can be assessed from the scatter plot of target and output variables (Figure 4).
Figure 4. Scatter plot of target and output variables
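The visual impression given by such a plot can be supplemented with numerical measures. The sketch below computes the Pearson correlation between target and output values and the root-mean-square error; the choice of these two measures and the sample numbers are assumptions made for illustration, not the criteria used in the study.

```python
import math


def pearson_r(targets, outputs):
    """Pearson correlation coefficient between targets and model outputs."""
    if len(targets) != len(outputs) or len(targets) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(targets)
    mean_t = sum(targets) / n
    mean_o = sum(outputs) / n
    cov = sum((t - mean_t) * (o - mean_o) for t, o in zip(targets, outputs))
    var_t = sum((t - mean_t) ** 2 for t in targets)
    var_o = sum((o - mean_o) ** 2 for o in outputs)
    if var_t == 0 or var_o == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_t * var_o)


def rmse(targets, outputs):
    """Root-mean-square error between targets and model outputs."""
    if len(targets) != len(outputs) or not targets:
        raise ValueError("need two non-empty sequences of equal length")
    squared = [(t - o) ** 2 for t, o in zip(targets, outputs)]
    return math.sqrt(sum(squared) / len(squared))


# Made-up values for illustration only; these are not results of the study.
targets = [10.0, 12.0, 15.0, 14.0, 18.0]
outputs = [11.0, 12.5, 14.0, 15.0, 17.0]
r = pearson_r(targets, outputs)
err = rmse(targets, outputs)
print(f"r = {r:.3f}, RMSE = {err:.3f}")
```

Points lying close to the diagonal of the scatter plot correspond to a correlation near 1 and a small error.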
Figure 5. Graph of models for prediction
After several trials, the model that best matched the actual data was obtained (Figures 5 and 6). This model (3.MLP 18-4-1, shown in pink) predicted that the number of cancer patients with reproductive organ disease in 2022 would be 583.
Figure 6. Graph of models for prediction (close-up)
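Assuming the label MLP 18-4-1 follows the common inputs-hidden-outputs naming convention, the selected network is a multilayer perceptron with 18 inputs, 4 hidden neurons, and 1 output. The sketch below shows the forward pass and the parameter count of such a network. The tanh hidden activation, the linear output, the random weights, and the placeholder input are all assumptions made for illustration; this is not the trained model from the study.

```python
import math
import random

# Layer sizes read from the label "MLP 18-4-1" (assumed: inputs-hidden-outputs).
N_INPUTS, N_HIDDEN, N_OUTPUTS = 18, 4, 1


def init_layer(n_in, n_out, rng):
    """Create random weights (n_out rows of n_in values) and zero biases."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def forward(x, hidden, output):
    """One forward pass: tanh hidden layer followed by a linear output layer."""
    w_h, b_h = hidden
    w_o, b_o = output
    h = [math.tanh(sum(w * v for w, v in zip(row, x)) + b)
         for row, b in zip(w_h, b_h)]
    return [sum(w * v for w, v in zip(row, h)) + b for row, b in zip(w_o, b_o)]


def count_parameters(n_in, n_hidden, n_out):
    """Number of weights and biases in a network with one hidden layer."""
    return (n_in + 1) * n_hidden + (n_hidden + 1) * n_out


rng = random.Random(0)  # fixed seed so the untrained weights are reproducible
hidden = init_layer(N_INPUTS, N_HIDDEN, rng)
output = init_layer(N_HIDDEN, N_OUTPUTS, rng)

x = [0.1] * N_INPUTS  # placeholder input vector, not real data
prediction = forward(x, hidden, output)
n_params = count_parameters(N_INPUTS, N_HIDDEN, N_OUTPUTS)
print(n_params, prediction)
```

With these layer sizes the network has only 81 adjustable parameters, which is a small model and is in keeping with the short length of a yearly series.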
In this work, the dynamics of the spread of cancer in the Khabarovsk Territory were studied; the source data were preprocessed; the influence of various factors on the number of cancer patients was examined; and forecasts for 2022 were obtained from the time series using neural networks.
The results of the study will make it possible to assess the current state of primary and secondary cancer prevention and will help organize additional measures aimed at preventing the most common malignancies among the population of the Khabarovsk Territory.