Correct identification of ethnicity is central to many epidemiologic analyses. Unfortunately, ethnicity data are often missing. Successful classification typically relies on large databases (n > 500,000 names) of known name-ethnicity associations. We propose an alternative naïve Bayesian strategy that uses substrings of full names. Name and ethnicity data for Malays, Indians, and Chinese were provided by a health and demographic surveillance site operating in Malaysia from 2011 to 2013. The data comprised a training data set (n = 10,104) and a test data set (n = 9,992). Names were split into contiguous 3-letter substrings, and these were used as the basis for the Bayesian analysis. Performance was evaluated on both data sets using Cohen's κ and measures of sensitivity and specificity. There was little difference between the classification performance in the training and test data (κ = 0.93 and 0.94, respectively). For the test data, the sensitivity values for the Malay, Indian, and Chinese names were 0.997, 0.855, and 0.932, respectively, and the specificity values were 0.907, 0.998, and 0.997, respectively. A naïve Bayesian strategy for the classification of ethnicity is promising. It performs at least as well as more sophisticated approaches. The possible application to smaller data sets is particularly appealing. Further research examining other substring lengths and other ethnic groups is warranted.
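The substring approach described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say how spaces, case, or unseen substrings were handled, so the sketch lowercases names, drops non-letters, and uses add-one smoothing. The names `trigrams`, `TrigramNaiveBayes`, `cohens_kappa`, and `sensitivity_specificity` are hypothetical, and the six training names are invented toy examples rather than study data.

```python
import math
from collections import Counter, defaultdict


def trigrams(name):
    """Contiguous 3-letter substrings (assumption: lowercased, non-letters dropped)."""
    letters = "".join(ch for ch in name.lower() if ch.isalpha())
    return [letters[i:i + 3] for i in range(len(letters) - 2)]


class TrigramNaiveBayes:
    """Multinomial naive Bayes over name trigrams with add-alpha smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # smoothing constant; an assumption, not from the abstract

    def fit(self, names, labels):
        self.class_counts = Counter(labels)
        self.gram_counts = defaultdict(Counter)
        for name, label in zip(names, labels):
            self.gram_counts[label].update(trigrams(name))
        self.vocab_size = len({g for c in self.gram_counts.values() for g in c})
        return self

    def predict(self, name):
        total = sum(self.class_counts.values())
        scores = {}
        for label, n in self.class_counts.items():
            counts = self.gram_counts[label]
            denom = sum(counts.values()) + self.alpha * self.vocab_size
            # Log prior plus the smoothed log likelihood of each trigram.
            scores[label] = math.log(n / total) + sum(
                math.log((counts[g] + self.alpha) / denom) for g in trigrams(name)
            )
        return max(scores, key=scores.get)


def cohens_kappa(true, pred):
    """Agreement between true and predicted labels, corrected for chance."""
    n = len(true)
    observed = sum(t == p for t, p in zip(true, pred)) / n
    t_counts, p_counts = Counter(true), Counter(pred)
    expected = sum(t_counts[k] * p_counts[k] for k in t_counts) / (n * n)
    return (observed - expected) / (1 - expected)


def sensitivity_specificity(true, pred, label):
    """One-vs-rest sensitivity and specificity for a single class."""
    pairs = list(zip(true, pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    tn = sum(t != label and p != label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    return tp / (tp + fn), tn / (tn + fp)


# Invented toy names for illustration only; not data from the study.
train_names = ["Ahmad bin Abdullah", "Siti binti Ismail", "Tan Wei Ming",
               "Lim Mei Ling", "Rajesh Kumar", "Suresh Subramaniam"]
train_labels = ["Malay", "Malay", "Chinese", "Chinese", "Indian", "Indian"]
model = TrigramNaiveBayes().fit(train_names, train_labels)
predictions = [model.predict(n) for n in ["Abdullah bin Ahmad", "Tan Mei Ling"]]
```

Working in log space avoids numerical underflow when many small probabilities are multiplied, and the smoothing keeps a single unseen substring from forcing a class probability to zero. Both are conventional choices rather than details taken from the abstract.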
On July 17, 2014, Malaysia Airlines flight MH17 was shot down, a tragedy that shocked the Dutch population. As part of a large longitudinal survey on mental health in pregnant women that had a study inclusion period of 19 months, we were able to evaluate the possible association of that incident with mood changes using pre- and postdisaster data. We compared mean Edinburgh Depression Scale (EDS) scores from a group of women (n = 126 cases) at 32 weeks' gestation during the first month after the crash with mean scores from a control group (n = 102) with similar characteristics who completed the EDS at 32 weeks' gestation during the same summer period in 2013. The mean EDS scores of the 126 case women in the first month after the crash were significantly higher than the scores of 102 control women. There were no differences in mean EDS scores between the 2 groups at the first and second trimesters. The present study is among the first in which perinatal mental health before and after the occurrence of a disaster has been investigated, and the results suggest that national disasters might lead to emotional responses.
Despite the high prevalence of diabetes mellitus, little is known about mortality associated with diabetes in Asia. Therefore, the authors followed 3,492 Chinese, Malay, and Asian Indian adults randomly selected from the general population in Singapore. Data on glucose tolerance, demographic characteristics, and other cardiovascular disease risk factors (lipid profile, blood pressure, smoking status, alcohol consumption, and obesity) were obtained in 1992. Vital status was determined as of December 31, 2001. There were 108 deaths over a period of 9 years. Impaired fasting glycemia or impaired glucose tolerance (IFG/IGT) (hazard ratio (HR)=1.39, 95% confidence interval (CI): 0.84, 2.31) and diabetes mellitus (HR=2.49, 95% CI: 1.58, 3.94) were associated with increased mortality after adjustment for age, gender, ethnic group, and educational level. Compared with Chinese with diabetes, Indians with diabetes experienced significantly greater mortality (HR=3.86, 95% CI: 1.76, 8.44) after adjustment for gender, age, educational level, smoking, hypertension, alcohol intake, and obesity. Undiagnosed diabetes and IFG/IGT were more common than known diabetes and also were associated with increased mortality. For reduction of mortality associated with IFG/IGT and diabetes, the authors recommend a screening program to detect undiagnosed diabetes and IFG/IGT along with aggressive treatment of diabetes after diagnosis.
Mothers' recall data collected in Malaysia in 1976-1977 are analyzed to study correlates of mortality of 5,471 infants. The respondent population is 1,262 women living in 52 primary sampling units of Peninsular Malaysia. Lengths of unsupplemented and supplemented breastfeeding and presence of piped household water and toilet sanitation are related to infant mortality in regressions that also control for other correlates. The analysis is disaggregated into three periods of infancy. Through six months of feeding, unsupplemented breastfeeding is more strongly associated with fewer infant deaths than is supplemented breastfeeding. Type of sanitation is generally more strongly associated with mortality than is type of water supply. The effects of breastfeeding and the environmental variables are shown to be strongly interactive and to change systematically during the course of infancy. Breastfeeding is more strongly associated with infant survival in homes without piped water or toilet sanitation. In homes with both modern facilities, supplemented breastfeeding has no significant effect, and unsupplemented breastfeeding is statistically significant only for mortality in days 8-28. Presence of modern water and sanitation systems appears unimportant for mortality of infants who are breastfed without supplementation for six months.
The effect of toilets, piped water, and maternal literacy on infant mortality was analyzed using data from the Malaysian Family Life Survey collected in 1976-1977. The effect of toilets and piped water on infant mortality depended on whether or not mothers were literate. The impact of having toilets was greater among the illiterate than among the literate, but the impact of piped water was greater among the literate than among the illiterate. The reduction in the infant mortality rate associated with toilets decreased from 130.7 ± 17.2 deaths in the absence of literate mothers to 76.2 ± 25.9 deaths in the presence of literate mothers. The reduction in the mortality rate for maternal literacy dropped from 44.4 ± 14.1 deaths without toilets to -10.1 ± 23.9 deaths with toilets. Reductions in mortality rates for piped water increased from 16.7 ± 12.7 deaths without literate mothers to 36.8 ± 21.0 deaths with literate mothers. Similarly, reductions in the mortality rate for maternal literacy rose from 44.4 ± 14.1 deaths in the absence of piped water to 64.5 ± 19.5 deaths in the presence of piped water. The results from a logistic model provided inferences similar to those from ordinary least squares. The authors infer that literate mothers protect their infants especially in unsanitary environments lacking toilets, and that when piped water is introduced, they use it more effectively to practice better hygiene for their infants.
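The paired figures above are internally consistent, which a short calculation makes visible. In a linear model with two binary factors and their product term, the interaction can be read as a difference of differences, and it comes out the same whichever factor is treated as the modifier. The sketch below assumes the quoted point estimates come from such a model; the helper name `interaction` is hypothetical, and the uncertainty terms are left out.

```python
def interaction(effect_without, effect_with):
    """Change in one factor's mortality reduction when the other factor is present."""
    return round(effect_with - effect_without, 1)


# Point estimates quoted in the abstract: reductions in the infant mortality rate, in deaths.
toilets_by_literacy = interaction(130.7, 76.2)   # toilets: illiterate vs. literate mothers
literacy_by_toilets = interaction(44.4, -10.1)   # literacy: without vs. with toilets
water_by_literacy = interaction(16.7, 36.8)      # piped water: illiterate vs. literate mothers
literacy_by_water = interaction(44.4, 64.5)      # literacy: without vs. with piped water

print(toilets_by_literacy, literacy_by_toilets)  # both -54.5
print(water_by_literacy, literacy_by_water)      # both 20.1
```

The negative toilets-literacy interaction means the two factors partly substitute for each other, while the positive water-literacy interaction means they reinforce each other, which matches the authors' interpretation.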
Analysis of mothers' recall data collected in 1976-1977 by a probability survey in Peninsular Malaysia shows an association between breastfeeding up to six months of age and improved survival of infants throughout the first year of life. Inappropriate sample selection and inadequate control of confounding can introduce large biases in these analyses. The magnitude and direction of these biases are presented. Even when these biases are dealt with, unsupplemented breastfeeding appears more beneficial than supplemented breastfeeding. The younger the infant and the longer the breastfeeding, the greater the estimated benefits in terms of deaths averted. The use of powdered infant formula did not appear to offset the detrimental effects of early weaning and supplementation. The positive relationships found in these analyses between breastfeeding and survival are not due to death precluding or terminating breastfeeding. Nor are they likely to be due to a shift away from breastfeeding because of recent illness, which was also controlled in the analyses. Nor are they likely to be due to other factors that both increase mortality risk and shorten breastfeeding; when such factors are taken into account, the beneficial effects of breastfeeding become stronger and imply that, if there had been no breastfeeding in this sample, twice as many babies would have died after the first week of life.
One hundred and ninety hepatitis B surface antigen-positive (HBsAg+) sera were subtyped. The sera came from blood donors, hepatitis patients, and patients and staff in a hemodialysis unit, all from Kuala Lumpur; Malaysian aborigines from three jungle locations in Peninsular Malaysia; and East Malaysians from Sarawak. Three subtypes, adr, adw, and ayw, were present in Malaysia at frequencies of 44%, 29%, and 27%, respectively. In Kuala Lumpur, 87% had subdeterminant d and 13% had y, whereas among the deep-jungle aborigines of Perak and Pahang, the y subdeterminant was present in 87% and d in 13%. A similar preponderance of y prevailed in Sarawak, East Malaysia. In Kuala Lumpur, the two main ethnic groups, Malays and Chinese, differed in subtype distribution: adr predominated in the Malays (61%), while adw predominated in the Chinese (51%). Subtype distribution was not related to the age or sex of carriers of the antigen, or to whether they had hepatitis or asymptomatic antigenemia.
Genome-wide association studies (GWAS) have identified over 100 single nucleotide polymorphisms (SNPs) associated with prostate cancer. However, information on the mechanistic basis for some associations is limited. Recent research has been directed towards the potential association of vitamin D concentrations and prostate cancer, but little is known about whether the aforementioned genetic associations are modified by vitamin D. We investigated the associations of 46 GWAS-identified SNPs, circulating concentrations of 25-hydroxyvitamin D (25(OH)D), and prostate cancer (3,811 cases, 511 of whom died from the disease, compared with 2,980 controls, from 5 cohort studies that recruited participants over several periods beginning in the 1980s). We used logistic regression models with data from the National Cancer Institute Breast and Prostate Cancer Cohort Consortium (BPC3) to evaluate interactions on the multiplicative and additive scales. After allowing for multiple testing, none of the SNPs examined was significantly associated with 25(OH)D concentration, and the SNP-prostate cancer associations did not differ by these concentrations. A statistically significant interaction was observed for each of 2 SNPs in the 8q24 region (rs620861 and rs16902094), 25(OH)D concentration, and fatal prostate cancer on both multiplicative and additive scales (P ≤ 0.001). We did not find strong evidence that associations between GWAS-identified SNPs and prostate cancer are modified by circulating concentrations of 25(OH)D. The intriguing interactions between rs620861 and rs16902094, 25(OH)D concentration, and fatal prostate cancer warrant replication.
The role of hormonal factors in the etiology of lymphoid neoplasms remains unclear. Previous studies have yielded conflicting results, have lacked sufficient statistical power to assess many lymphoma subtypes, or have lacked detailed information on relevant exposures. Within the European Prospective Investigation Into Cancer and Nutrition cohort, we analyzed comprehensive data on reproductive factors and exogenous hormone use collected at baseline (1992-2000) among 343,458 women, including data on 1,427 incident cases of B-cell non-Hodgkin lymphoma (NHL) and its major subtypes identified after a mean follow-up period of 14 years (through 2015). We estimated hazard ratios and 95% confidence intervals using multivariable proportional hazards modeling. Overall, we observed no statistically significant associations between parity, age at first birth, breastfeeding, oral contraceptive use, or ever use of postmenopausal hormone therapy and risk of B-cell NHL or its subtypes. Women who had undergone surgical menopause had a 51% higher risk of B-cell NHL (based on 67 cases) than women with natural menopause (hazard ratio = 1.51, 95% confidence interval: 1.17, 1.94). Given that this result may have been due to chance, our results provide little support for the hypothesis that sex hormones play a role in lymphomagenesis.