# Basic Statistical Analyses

*Calculation of Rates Per 100 000 Population*

The annual incidence rate per 100 000 population should be calculated for the total population, for the male and female populations separately, and for subgroups by age and sex, based on the number of persons who presented to hospital following a suicide attempt or self-harm in each calendar year.

Crude self-harm rates (including suicide attempts) should be calculated by dividing the number of persons who engaged in self-harm (n) by the relevant population figure (p) and multiplying the result by 100 000, i.e. (n/p) × 100 000. For age-standardized rates, the same calculation is applied within each age group before weighting by a standard population. Rates should be calculated on the basis of the number of persons resident in the relevant area who engaged in self-harm, irrespective of whether they were treated in that area or elsewhere.
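The crude rate formula can be sketched in a few lines of Python. The function name and the figures below are hypothetical and serve only to illustrate the arithmetic (n/p) × 100 000.

```python
def crude_rate_per_100k(n_persons: int, population: int) -> float:
    """Crude rate per 100 000 population: (n / p) * 100 000."""
    if population <= 0:
        raise ValueError("population must be positive")
    return n_persons / population * 100_000


# Hypothetical figures: 250 residents who engaged in self-harm
# in one calendar year, in a resident population of 180 000.
rate = crude_rate_per_100k(250, 180_000)
print(f"{rate:.1f} per 100 000")  # prints "138.9 per 100 000"
```

The numerator counts residents of the area, wherever they were treated, so the population figure must refer to the same area and calendar year.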

It is advisable to adjust for the age composition of the population under study as this ensures that differences observed by sex or by area are due to differences in the incidence of self-harm rather than differences in the composition of the populations.

For instance, European age-standardized rates (EASRs) are the incidence rates that would be observed if the population under study had the same age composition as a theoretical European population. EASRs can be calculated as follows: for each five-year age group, the number of persons who engaged in self-harm is divided by the population at risk and then multiplied by the number in that age group of the European standard population (5). The EASR is the sum of these age-specific figures; when the standard population totals 100 000, this sum is already a rate per 100 000. The same method can be applied to different regions and also globally.
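The following is a minimal sketch of this direct standardization. The function name, the three broad age bands, and all counts and weights are hypothetical and chosen only to keep the example short. A real calculation would use five-year age groups and the published European standard population cited above.

```python
def age_standardized_rate(events, population, standard):
    """Directly standardized rate per 100 000.

    For each age group: (events / population at risk) * standard weight.
    The weighted figures are summed and divided by the total standard
    population, so the weights need not sum to exactly 100 000.
    """
    if not (events.keys() == population.keys() == standard.keys()):
        raise ValueError("all inputs must cover the same age groups")
    weighted = sum(events[g] / population[g] * standard[g] for g in standard)
    return weighted / sum(standard.values()) * 100_000


# Hypothetical data (illustrative bands and weights, not the real standard).
events = {"0-29": 30, "30-59": 40, "60+": 10}
population = {"0-29": 20_000, "30-59": 25_000, "60+": 10_000}
standard = {"0-29": 43_000, "30-59": 41_000, "60+": 16_000}

easr = age_standardized_rate(events, population, standard)
print(f"{easr:.1f} per 100 000")  # prints "146.1 per 100 000"
```

With these figures the crude rate would be 80/55 000 × 100 000 ≈ 145.5, so the standardized rate differs slightly because the study population and the standard population have different age compositions.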

The following points should be kept in mind:

- Rates based on fewer than 20 events may be unreliable.
- If the same individual presents to the hospital more than once on the same calendar day, it should be clarified whether a second suicide attempt or act of self-harm has been made or whether the re-presentation is due to absconding and returning, or being transferred to another hospital. If no second suicide attempt or act of self-harm has been made, this should be recorded as a single suicide attempt or self-harm event.
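Both points above lend themselves to simple automated checks. The sketch below assumes a hypothetical record layout (dictionaries with `person_id`, `date` and `hospital` keys). It only flags same-day re-presentations for manual review and does not decide whether they are separate events, since that requires the clinical clarification described above.

```python
from collections import defaultdict
from datetime import date

MIN_EVENTS = 20  # reliability threshold stated in the text


def is_rate_reliable(n_events: int) -> bool:
    """True if a rate is based on at least MIN_EVENTS events."""
    return n_events >= MIN_EVENTS


def same_day_representations(presentations):
    """Group presentations by (person, calendar day); keep groups with >1 record."""
    grouped = defaultdict(list)
    for record in presentations:
        grouped[(record["person_id"], record["date"])].append(record)
    return {key: recs for key, recs in grouped.items() if len(recs) > 1}


# Hypothetical presentations.
presentations = [
    {"person_id": "A", "date": date(2023, 5, 1), "hospital": "H1"},
    {"person_id": "A", "date": date(2023, 5, 1), "hospital": "H2"},
    {"person_id": "B", "date": date(2023, 5, 1), "hospital": "H1"},
]

for (person_id, day), recs in same_day_representations(presentations).items():
    print(person_id, day, [r["hospital"] for r in recs])
```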

*Confidence Intervals*

Confidence intervals provide a margin of error within which underlying rates may be presumed to fall on the basis of observed data. The intervals described here assume that the event rate (n/p) is small and that the events are independent of one another. An approximate 95% confidence interval for the number of events (n) is $n \pm 2\sqrt{n}$.

- For instance, if 25 acts of self-harm (including suicide attempts) are observed in a specific region in one year, the 95% confidence interval will be $25 \pm 2\sqrt{25}$, i.e. 15 to 35. Thus, the 95% confidence interval around a rate ranges from $(n - 2\sqrt{n})/p$ to $(n + 2\sqrt{n})/p$, where p is the population at risk. If the rate is expressed per 100 000 population, these quantities must be multiplied by 100 000.
- A 95% confidence interval may also be calculated to establish whether two rates differ significantly. First, the difference between the rates is calculated. The 95% confidence interval for this rate difference (rd) ranges from $rd - 2\sqrt{n_1/p_1^2 + n_2/p_2^2}$ to $rd + 2\sqrt{n_1/p_1^2 + n_2/p_2^2}$, where $n_1$ and $p_1$ are the number of events and the population at risk for the first rate, and $n_2$ and $p_2$ those for the second. If the rates are expressed per 100 000 population, then $2\sqrt{n_1/p_1^2 + n_2/p_2^2}$ must be multiplied by 100 000 before being added to and subtracted from the rate difference. If zero lies outside the 95% confidence interval, the difference between the rates is statistically significant.
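The two interval formulas above can be sketched as follows. The function names and the second region's figures are hypothetical, and the code applies the approximations exactly as stated in the text, with no further adjustment such as truncating a negative lower bound.

```python
import math


def rate_ci_per_100k(n: int, p: int) -> tuple[float, float]:
    """Approximate 95% CI for a rate per 100 000: (n -/+ 2*sqrt(n)) / p."""
    half_width = 2 * math.sqrt(n)
    return (n - half_width) / p * 100_000, (n + half_width) / p * 100_000


def rate_difference_ci_per_100k(n1, p1, n2, p2):
    """Approximate 95% CI for rd = n1/p1 - n2/p2, per 100 000."""
    rd = n1 / p1 - n2 / p2
    half_width = 2 * math.sqrt(n1 / p1**2 + n2 / p2**2)
    return (rd - half_width) * 100_000, (rd + half_width) * 100_000


# Example from the text: 25 events, here in a population of 100 000.
low, high = rate_ci_per_100k(25, 100_000)
print(f"{low:.1f} to {high:.1f} per 100 000")  # prints "15.0 to 35.0 per 100 000"

# Hypothetical comparison: 25 events vs 64 events, each per 100 000 population.
d_low, d_high = rate_difference_ci_per_100k(25, 100_000, 64, 100_000)
significant = not (d_low <= 0 <= d_high)  # zero outside the interval
print(f"{d_low:.1f} to {d_high:.1f}; significant: {significant}")
```

In the comparison, the rate difference is -39 per 100 000 and the interval runs from roughly -57.9 to -20.1, so zero lies outside it and the difference would be judged statistically significant.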

*Repeat Event Analysis*

Recording the unique person identification number allows for the analysis of repeated suicide attempts and self-harm acts. A repeated suicide attempt or act of self-harm should be defined as a re-presentation to any hospital due to a further suicide attempt or self-harm act undertaken after leaving the hospital following presentation for a previous suicide attempt or self-harm act. An example of repeat event analysis is conditional risk set analysis.
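As a minimal sketch of the data preparation that precedes any repeat event analysis, the code below orders each person's episodes by date and numbers them. The tuple layout and the function name are hypothetical, and it assumes that each record is already a distinct episode, with same-day re-presentations resolved beforehand. It does not fit a conditional risk set model; it only produces the episode order and the gap between episodes that such a model would typically need.

```python
from collections import defaultdict
from datetime import date


def number_episodes(episodes):
    """Return (person_id, episode_number, date, days_since_previous) rows.

    episode_number 1 is the first recorded presentation; 2 and above are
    repeats. days_since_previous is None for a person's first episode.
    """
    by_person = defaultdict(list)
    for person_id, day in episodes:
        by_person[person_id].append(day)

    rows = []
    for person_id, days in by_person.items():
        previous = None
        for order, day in enumerate(sorted(days), start=1):
            gap = (day - previous).days if previous is not None else None
            rows.append((person_id, order, day, gap))
            previous = day
    return rows


# Hypothetical episodes keyed by the unique person identification number.
episodes = [
    ("A", date(2023, 3, 1)),
    ("B", date(2023, 2, 14)),
    ("A", date(2023, 1, 10)),
]

rows = number_episodes(episodes)
repeaters = {pid for pid, order, _, _ in rows if order == 2}
print(rows)
print(repeaters)  # prints {'A'}
```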