This article studies the constraint on two ordered extreme-minima random variables
when one variable is considered to be stochastically smaller than the other. The
quantile functions of the probability distributions are used to establish a partial
ordering between the two variables. Some extensions and generalizations of the
stochastic ordering are given that rely on the sign of the shape parameter.
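As a rough illustration of ordering through quantile functions, the sketch below checks whether one quantile function lies at or below another on a grid of probabilities, which is the usual characterisation of "stochastically smaller". It assumes the Gumbel distribution for minima (the zero-shape case) purely for simplicity; the function names, the grid check, and the parameter values are illustrative assumptions and are not taken from the article.

```python
import math

def gumbel_min_quantile(p, loc, scale):
    """Quantile function of the Gumbel distribution for minima.

    Derived from the CDF F(x) = 1 - exp(-exp((x - loc) / scale)).
    """
    return loc + scale * math.log(-math.log(1.0 - p))

def stochastically_smaller(q_x, q_y, grid_size=999):
    """Return True if q_x(p) <= q_y(p) at every grid point p in (0, 1).

    This is a numerical check on a finite grid, not a proof.
    """
    grid = [(i + 1) / (grid_size + 1) for i in range(grid_size)]
    return all(q_x(p) <= q_y(p) for p in grid)

# Hypothetical parameter choices, for illustration only.
q_x = lambda p: gumbel_min_quantile(p, loc=0.0, scale=1.0)
q_y = lambda p: gumbel_min_quantile(p, loc=1.0, scale=1.0)  # shifted right
q_z = lambda p: gumbel_min_quantile(p, loc=0.0, scale=2.0)  # different scale

x_below_y = stochastically_smaller(q_x, q_y)
x_below_z = stochastically_smaller(q_x, q_z)
z_below_x = stochastically_smaller(q_z, q_x)
```

With equal scales, a shift in location orders the two variables. With unequal scales the quantile functions cross, so neither variable is stochastically smaller than the other, which is why the ordering is only partial.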
The box plot has been used since the 1970s to check for outliers and for asymmetry
in data. The existing box plot is constructed from five summary statistics calculated
from either discrete or continuous data. Many refinements of the box plot have departed
from the elegant and simple approach of exploratory data analysis by incorporating many
additional statistics, thereby abandoning the philosophy behind the creation of the box
plot. The modification proposed here uses the range together with the minimum and
maximum values to suit selected discrete distributions for which outliers are no longer
an important criterion. The new box plot is based not on the asymmetry of the
distribution but on the spread of the data and its partition into range measures. The
proposed name for this box plot, which uses only three summary statistics, is the
range-box plot.
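The contrast between the five-value and three-value summaries can be sketched as follows. This is only one reading of the abstract: it assumes the three statistics are the minimum, the maximum, and the range, and it does not reproduce how the range-box plot partitions or draws the data. The function names and the small sample of counts are made up for illustration.

```python
import statistics

def five_number_summary(data):
    """Minimum, lower quartile, median, upper quartile, maximum."""
    q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return {"min": min(data), "q1": q1, "median": q2, "q3": q3, "max": max(data)}

def range_summary(data):
    """Assumed three-value summary: minimum, maximum, and range."""
    lowest, highest = min(data), max(data)
    return {"min": lowest, "max": highest, "range": highest - lowest}

# Made-up counts standing in for a sample from a discrete distribution.
counts = [0, 1, 1, 2, 3, 3, 4, 6]
five_values = five_number_summary(counts)
three_values = range_summary(counts)
```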
Recent studies have shown that independent and identically distributed (i.i.d.) Gaussian
random variables are not suitable for modelling extreme values observed during extremal
events. Moreover, many real-life data on extreme values are dependent and stationary
rather than i.i.d. We propose a stationary autoregressive (AR) process with
Gumbel-distributed innovations and characterise the short-term dependence among maxima
of an AR process over a range of sample sizes with varying degrees of dependence. We
compute maximum likelihood estimates of the parameters of the Gumbel AR process and of
its residuals, and evaluate the performance of the parameter estimates. The AR process
is also fitted to the Gumbel-generalised Pareto distribution (GPD), and we evaluate the
performance of the parameter estimates fitted to the cluster maxima and to the original
series. Ignoring the effect of dependence leads to overestimation of the location
parameter of the Gumbel-AR(1) process. Estimating the location parameter of the AR
process from the residuals gives a better estimate. Estimates of the scale parameter
perform marginally better for the original series than for the residuals. The degree of
clustering increases as dependence is enhanced for the AR process. The Gumbel-AR(1)
process fitted to the threshold exceedances shows that the estimates of the scale and
shape parameters fitted to the cluster maxima perform better as sample size increases;
however, ignoring the effect of dependence leads to underestimation of the scale
parameter. The shape parameter estimated from the original series is superior to the
estimate from the threshold excesses fitted to the Gumbel-distributed generalised Pareto
distribution.
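A minimal simulation sketch of an AR(1) process with Gumbel innovations is given below. It assumes the recursion X_t = phi * X_{t-1} + e_t with e_t drawn from a Gumbel distribution for maxima, which may differ from the exact construction used in the paper. The function names, the burn-in length, the block size, and the parameter values are illustrative assumptions. No estimation step is shown.

```python
import math
import random

def gumbel_max_sample(rng, loc, scale):
    """Draw one Gumbel (maxima) variate by inverting the CDF exp(-exp(-z))."""
    u = rng.random()
    while u == 0.0:  # avoid log(0); random() can return exactly 0.0
        u = rng.random()
    return loc - scale * math.log(-math.log(u))

def simulate_gumbel_ar1(n, phi, loc=0.0, scale=1.0, burn_in=500, seed=0):
    """Simulate X_t = phi * X_{t-1} + e_t with Gumbel innovations e_t.

    The first `burn_in` values are discarded so that the retained series
    is close to its stationary regime (requires |phi| < 1).
    """
    rng = random.Random(seed)
    x = 0.0
    series = []
    for t in range(n + burn_in):
        x = phi * x + gumbel_max_sample(rng, loc, scale)
        if t >= burn_in:
            series.append(x)
    return series

def block_maxima(series, block_size):
    """Maximum of each complete, non-overlapping block of the series."""
    last_start = len(series) - block_size
    return [max(series[i:i + block_size]) for i in range(0, last_start + 1, block_size)]

# Hypothetical settings: moderate dependence, standard Gumbel innovations.
series = simulate_gumbel_ar1(n=20000, phi=0.5)
maxima = block_maxima(series, block_size=100)
```

Under this recursion the stationary mean is (loc + gamma * scale) / (1 - phi), where gamma is the Euler-Mascheroni constant, so the mean of the series differs from the mean of the innovations whenever phi is non-zero. That gives some intuition for why a location estimate that ignores the dependence can be biased.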
The source of gastrointestinal bleeding (GIB) remains uncertain in patients presenting without hematemesis. This paper studies the accuracy, specificity and sensitivity of the Naive Bayesian Classifier (NBC) in identifying the source of GIB in the absence of hematemesis. Data from 325 patients who were admitted via the emergency department (ED) for GIB without hematemesis and who underwent confirmatory testing were analysed. Six attributes related to patient demographics and presenting signs were chosen. NBC was used to calculate the conditional probability of an individual being assigned to upper gastrointestinal bleeding (UGIB) or lower gastrointestinal bleeding (LGIB). High classification accuracy (87.3%), specificity (0.85) and sensitivity (0.88) were achieved. NBC is a useful tool to support the identification of the source of gastrointestinal bleeding in patients without hematemesis.
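The classification step can be sketched with a small categorical naive Bayes classifier. The abstract does not name the six attributes or describe any smoothing, so the two placeholder attributes, their values, the Laplace smoothing constant, and the six training rows below are invented for illustration and carry no clinical meaning.

```python
import math
from collections import Counter, defaultdict

def train_nb(rows, labels, alpha=1.0):
    """Count class and per-attribute value frequencies (alpha = Laplace smoothing)."""
    class_counts = Counter(labels)
    feature_counts = defaultdict(lambda: defaultdict(Counter))  # [class][attr][value]
    values = defaultdict(set)                                   # distinct values per attr
    for row, label in zip(rows, labels):
        for j, value in enumerate(row):
            feature_counts[label][j][value] += 1
            values[j].add(value)
    return class_counts, feature_counts, values, alpha

def posterior(model, row):
    """Posterior class probabilities under the conditional-independence assumption."""
    class_counts, feature_counts, values, alpha = model
    total = sum(class_counts.values())
    log_scores = {}
    for label, n_label in class_counts.items():
        score = math.log(n_label / total)
        for j, value in enumerate(row):
            count = feature_counts[label][j][value]
            score += math.log((count + alpha) / (n_label + alpha * len(values[j])))
        log_scores[label] = score
    top = max(log_scores.values())
    weights = {label: math.exp(s - top) for label, s in log_scores.items()}
    norm = sum(weights.values())
    return {label: w / norm for label, w in weights.items()}

def sensitivity_specificity(truth, predicted, positive):
    """Sensitivity and specificity; assumes both classes occur in `truth`."""
    pairs = list(zip(truth, predicted))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    return tp / (tp + fn), tn / (tn + fp)

# Invented toy rows with two placeholder attributes (not the study's data).
rows = [("yes", "high"), ("yes", "high"), ("yes", "low"),
        ("no", "low"), ("no", "low"), ("no", "high")]
labels = ["UGIB", "UGIB", "UGIB", "LGIB", "LGIB", "LGIB"]
model = train_nb(rows, labels)
probabilities = posterior(model, ("yes", "high"))
```

Which class counts as "positive" when reporting sensitivity and specificity is a convention; the abstract does not state it, so `positive` is left as an explicit argument here.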
Left-truncated and censored survival data are commonly encountered in medical studies. However, traditional inferential methods that rely heavily on normality assumptions often fail when the lifetimes of observations in a study are both truncated and censored. Thus, it is important to develop alternative inferential procedures that relax the normality assumption and rely instead on the distribution of the data at hand. In this research, a three-parameter log-normal parametric survival model was extended to incorporate left-truncated and right-censored medical data with covariates. Following that, bootstrap inferential procedures using non-parametric and parametric bootstrap samples were applied to the parameters of this model. The performance of the parameter estimates was assessed at various combinations of truncation and censoring levels via a simulation study. The recommended bootstrap intervals were applied to a lung cancer survival data set.
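The idea of a non-parametric bootstrap interval can be sketched with a generic percentile interval. This is a deliberately simplified stand-in: it resamples complete observations and uses the sample median as the statistic, and it does not handle left truncation, right censoring, covariates, or the three-parameter log-normal likelihood described above. The function name, the simulated sample, and all settings are illustrative assumptions.

```python
import random
import statistics

def percentile_bootstrap_ci(data, statistic, n_boot=2000, level=0.95, seed=0):
    """Percentile bootstrap interval for `statistic` from resampling with replacement."""
    rng = random.Random(seed)
    n = len(data)
    estimates = sorted(
        statistic([data[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot)
    )
    tail = (1.0 - level) / 2.0
    lower = estimates[int(tail * n_boot)]
    upper = estimates[min(n_boot - 1, int((1.0 - tail) * n_boot))]
    return lower, upper

# Simulated log-normal "lifetimes" (no truncation or censoring), for illustration only.
sample_rng = random.Random(1)
lifetimes = [sample_rng.lognormvariate(0.0, 0.5) for _ in range(50)]
interval = percentile_bootstrap_ci(lifetimes, statistics.median)
```

A parametric bootstrap would replace the resampling line with draws from the fitted model, so that the interval reflects the assumed distribution rather than the empirical one.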
A boxplot is an exploratory data analysis (EDA) tool for a compact visual display of a distributional summary of a univariate data set. It is designed to capture all typical observations and to display the location, spread, skewness and tails of the data. Some of these functions are considered more reliable for symmetric data and thus less appropriate for skewed data such as extreme-value data. Tukey’s standard boxplot mistakenly marks many observations from extreme data as outliers. A new boxplot implementation is presented which adopts a fence definition based on the extent of skewness and enhances the plot with additional features, such as a quantile region for the parameters of the generalized extreme value (GEV) distribution fitted to an extreme data set. The advantage of the new superimposed region is illustrated in terms of batch comparison of extreme samples and as an EDA tool for determining the search region or direction used in the optimisation routines for maximum likelihood estimation of the GEV model. Simulated and real-life data sets were used to justify the advantages of the boxplot enhancement.
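The behaviour of the standard fences on skewed data can be sketched as follows. The code implements only Tukey's usual rule (1.5 times the interquartile range beyond the quartiles); the skewness-adjusted fence proposed in the paper is not defined in the abstract and is not reproduced here. The function names, the simulated Gumbel sample, and the seed are illustrative assumptions.

```python
import math
import random
import statistics

def tukey_fences(data, k=1.5):
    """Lower and upper fences: quartiles extended by k times the interquartile range."""
    q1, _, q3 = statistics.quantiles(data, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def flagged_points(data, k=1.5):
    """Split the observations lying outside the fences into low and high groups."""
    low_fence, high_fence = tukey_fences(data, k)
    low = [x for x in data if x < low_fence]
    high = [x for x in data if x > high_fence]
    return low, high

# Simulated standard Gumbel (maxima) sample: right-skewed, with no contamination.
rng = random.Random(0)
sample = [-math.log(-math.log(1.0 - rng.random())) for _ in range(1000)]
low_flags, high_flags = flagged_points(sample)
```

Every point in this sample comes from the same distribution, yet the symmetric fences flag points in the long upper tail while leaving the short lower tail untouched. This is the kind of mislabelling that a skewness-aware fence is meant to reduce.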