These Three Datasets Measure Violence Trends

Fayren Chaerunnissa
5 min read · Mar 3, 2023

Violence takes many forms around the world, ranging from petty crime to violent extremism. Because these forms vary so widely, there is an urgent need for comprehensive tools that monitor violence and conflict across national contexts, so that trends in violence can be measured and predicted in support of preventive action.

The following are three violence databases with comprehensive data on violence and conflict over time.

Global Terrorism Database (GTD) by the National Consortium for the Study of Terrorism and Responses to Terrorism, University of Maryland (START UMD)

The Global Terrorism Database is an open-source database of terrorist events around the world from 1970 through 2020, with further annual updates planned. It was developed to help researchers better understand the phenomenon of terrorism, and it is specifically designed to be amenable to the latest quantitative analytic techniques used in the social and computational sciences.

The GTD includes systematic data on domestic as well as transnational and international terrorist incidents that have occurred during this time period and now includes more than 200,000 cases. Information in the GTD is drawn entirely from publicly available, open-source materials. These include electronic news archives, existing data sets, secondary source materials such as books and journals, and legal documents. The data is gathered from more than 4,000,000 news articles and 25,000 news sources that were reviewed to collect incident data.

The database records the date and location of each incident, the weapons used and the nature of the target, the number of casualties, and, when identifiable, the group or individual responsible. Each case includes information on at least 45 variables, with more recent incidents including information on more than 120 variables. These variables include incident date, location, perpetrator group name, types of weapons used, amount of damage, number of fatalities and more.
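To make the idea of measuring a trend concrete, here is a minimal sketch that counts incidents and fatalities per year from GTD-style rows. The column names (iyear, country_txt, nkill) are assumed to match the GTD export and should be checked against the codebook, and the rows are synthetic placeholders, not real GTD records.

```python
import csv
import io
from collections import defaultdict

# Synthetic rows for illustration only; these are NOT real GTD records.
# The column names (iyear, country_txt, nkill) are assumptions about the export.
SAMPLE_CSV = """iyear,country_txt,nkill
2015,Country A,3
2015,Country B,
2016,Country A,1
2016,Country A,0
2017,Country B,5
"""


def yearly_trend(csv_file):
    """Return {year: {"incidents": n, "fatalities": k}}, sorted by year."""
    totals = defaultdict(lambda: {"incidents": 0, "fatalities": 0})
    for row in csv.DictReader(csv_file):
        year = int(row["iyear"])
        totals[year]["incidents"] += 1
        # Fatality counts may be blank when unknown; this sketch counts them as 0.
        raw = (row.get("nkill") or "").strip()
        totals[year]["fatalities"] += int(float(raw)) if raw else 0
    return dict(sorted(totals.items()))


trend = yearly_trend(io.StringIO(SAMPLE_CSV))
for year, stats in trend.items():
    print(year, stats["incidents"], stats["fatalities"])
```

With a real export, the same function could be given an open file handle instead of the in-memory sample. Treating unknown fatality counts as zero is a simplification that understates totals, so a real analysis may prefer to track missing values separately.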

This is a good dataset for measuring violence trends because it contains detailed information about each recorded terrorism incident, and terrorism is one specific form of violence. Researchers and policymakers can use this data to design and implement measures to prevent terrorism. However, the dataset also has drawbacks, one of which is data consistency. According to the GTD website, there were some time periods when data was not collected in real time because some media sources about certain incidents were unavailable. The methodology used to compile the database has been improved since 2012, though, so the database is still extremely useful for investigating recent cases (from the last decade or so).

The dataset is publicly available at https://www.start.umd.edu/gtd/

Bangladesh Peace Observatory (BPO) by the Center for Genocide Studies (CGS), University of Dhaka

The Bangladesh Peace Observatory is an open-access data platform that compiles violent incidents from different streams of publicly available data, mainly news media in Bangladesh. It also publishes a bimonthly Peace Report to provide a comprehensive understanding of crime and violence in the country. The database was established in collaboration with UNDP.

The data was collected from open-source news media outlets. However, BPO also includes data from case-by-case interviews with witnesses of violent incidents that may not have been publicly reported. For instance, the Rohingya refugee camp in Cox’s Bazar in southern Bangladesh sees multiple violent incidents that journalists and media outlets do not have the access to report. Case-by-case interviews therefore allow the BPO team to obtain relevant case information for the database without relying on public media outlets.

The BPO categorizes the different types of violence that occur across Bangladesh. Its list of violence categories includes abduction, assault, clash, gunfight, mob violence, sexual assault, terror attacks, and more. The database also adds detail by grouping incidents under relevant themes such as COVID-19, electoral violence, Rohingya-related violence, and violent extremism, and by recording details like the actors and their motives.
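As a rough illustration of how categories and themes can be combined, the sketch below cross-tabulates a handful of incident records. The field names ("category", "theme") and the records themselves are hypothetical and do not reflect the BPO's actual schema or data; only the category and theme labels are taken from the lists above.

```python
from collections import Counter

# Hypothetical incident records; field names and rows are illustrative only
# and do not come from the BPO platform.
incidents = [
    {"category": "clash", "theme": "electoral violence"},
    {"category": "assault", "theme": "electoral violence"},
    {"category": "clash", "theme": "electoral violence"},
    {"category": "abduction", "theme": "Rohingya-related violence"},
    {"category": "mob violence", "theme": None},
]


def cross_tab(records):
    """Count incidents for each (category, theme) pair."""
    return Counter(
        (r["category"], r["theme"] or "no theme") for r in records
    )


table = cross_tab(incidents)
for (category, theme), count in table.most_common():
    print(f"{category:15} {theme:28} {count}")
```

Counting pairs rather than categories alone shows which forms of violence cluster under which theme, which is the kind of context-specific comparison the thematic grouping is meant to support.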

This is a good dataset for investigating trends in violence because it has detailed records of each violent incident, which lets researchers group incidents by several variables, including the actors, the motives, and how they intersect with themes currently relevant in Bangladesh. The way the data is structured makes it easier for researchers to draw conclusions about violence trends in different contexts. However, the data covers only Bangladesh (which makes sense, because it is a national database). It is therefore not the right database for investigating global violence trends, although it could certainly contribute to some extent.

The dataset is publicly available at http://peaceobservatory-cgs.org/#/

The Collective Violence Early Warning (CVEW) Dataset, CSIS Indonesia

The Collective Violence Early Warning (CVEW) Dataset is a comprehensive monitoring tool and early warning system for collective violence and conflict in Indonesia. Its data was collected from newspapers and publicly available media sources covering cases of violence across Indonesia at the local and provincial level. The database codes over 57 unique variables for each violent incident, including the actors and motives.

Information on violent incidents is collected only from news sources that publish frequently and consistently and that cover a wide geographical area within Indonesia’s provinces. Factual accuracy is further ensured by selecting only verified news sources from the Press Council’s media list.

This database is a good fit for my research question because it provides detailed records of each violent incident in Indonesia, grouped by specific categories such as actors and motives. It also makes it possible to conduct a more detailed analysis based on the geographical distribution of incidents. However, one limitation is that details may be missing for some incidents because certain news sources suddenly became inaccessible. Furthermore, some incidents were reported in the national media but not in the provincial news sources. There may therefore be issues with the consistency of the data.
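One simple way to gauge this kind of coverage gap is to compare which incidents appear in each kind of source. The sketch below assumes each incident has already been given a shared identifier across sources; the identifiers and the function name are hypothetical, and this is not a description of how CVEW itself reconciles its sources.

```python
# Hypothetical incident identifiers, grouped by the kind of outlet that
# reported them. Matching reports to one identifier is assumed to be done already.
national = {"inc-001", "inc-002", "inc-003", "inc-004"}
provincial = {"inc-002", "inc-003", "inc-005"}


def coverage_gaps(national_ids, provincial_ids):
    """Split incidents by which kind of source reported them."""
    return {
        "national_only": sorted(national_ids - provincial_ids),
        "provincial_only": sorted(provincial_ids - national_ids),
        "both": sorted(national_ids & provincial_ids),
    }


gaps = coverage_gaps(national, provincial)
for label, ids in gaps.items():
    print(label, len(ids), ids)
```

A large "national_only" group would point to the inconsistency described above, where provincial outlets miss incidents that national media covered.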

The dataset is publicly available at https://violence.csis.or.id/

Fayren Chaerunnissa

life is a circus and i am the clown | Quantitative Methods in the Social Sciences Columbia University | https://github.com/fayrenheit