Analysing datasets and deriving visualisations from them is one thing; teaching journalists how to do it is quite another. In the past few months, we had been analysing datasets from different economic sectors in Nigeria, so when we learned about the workshop organised by NRGI at the School of Media and Communication for reporters in the oil and gas sector, we decided to venture into another sector: oil and gas. We found an interesting dataset and decided to use it as a case study for our session and as the topic of this article.
For some time, our approach to data visualisation has alternated between explanatory and exploratory. Typically, we used the former when we had something specific to say about our findings. However, it was clear to us that, for a workshop which entailed helping data novices (as most of the participants were) to understand the relevance and techniques of data storytelling, we had to lean towards the latter.
Before going ahead to analyse the data, we thought it best to do a little recap of what we knew about the industry. Here are a few key ideas:
Oil and Gas in Nigeria
When was oil discovered in Nigeria?
Oil was discovered in Nigeria in 1956 at Oloibiri in the Niger Delta after 50 years of exploration. The discovery was made on Sunday 15 January 1956 by Shell D’Arcy (later known as Shell-BP). In 1958, Nigeria joined the ranks of oil producers when its first oil field came on stream, producing 5,100 barrels per day (bpd).
How much oil is there in Nigeria?
As of 2011, Nigeria had 37.2 billion barrels of proven oil reserves. Although Libya has larger reserves, Nigeria ranks as the largest oil producer in Africa and the 11th largest in the world, averaging 2.28 million barrels per day.
What is Nigeria’s current dependency on oil?
Today, the oil and gas sector accounts for about 35% of gross domestic product, and petroleum export revenue represents over 90% of total export revenue.
How many oil exploration companies are there in Nigeria?
According to data from NRGI, there are about 220 oil companies operating in the country.
Which are the highest producing oil companies?
The Nigerian National Petroleum Corporation (NNPC) operates joint venture partnerships with six oil companies, namely Royal Dutch Shell, Chevron, Exxon-Mobil, Agip, Total and Texaco (now merged with Chevron). Together, these joint ventures account for approximately 95% of all crude oil output, while local independent companies operating in marginal fields account for the remaining 5%.
After the usual preamble on the fundamental notions and principles of data (big data, open data, data formats, etc.), we went right into it. At first, examining the large dataset was quite a daunting task for the participants, but we broke it down for them so that it became a little less intimidating. Once it was clear that they had grown comfortable with the apparently chaotic jumble of numbers and text (which it really wasn’t, since the dataset was actually structured), we asked them the ‘question of questions’: what exactly do you want to find out from the data? In other words, we wanted to help them understand how to ask the right questions of their dataset.
There is a famous quote attributed to Albert Einstein which aptly captures the relevance of this exercise:
“If I had an hour to solve a problem and my life depended on the solution, I would spend the first 55 minutes determining the proper question to ask.” (Albert Einstein)
Thankfully, they got the idea. Questions came thick and fast, and we had to limit them to four:
- Which companies are defaulting on emission and pollution taxes?
- What were the highest and lowest bonuses paid?
- How much was directly paid to the government?
- How much was paid as royalties?
It was a great start! Unfortunately, within the time available for the session, we could attempt to answer only the first one.
The next step was to carry out some basic data sorting and filtering in a spreadsheet. (We used Microsoft Excel, but any other standard spreadsheet would have sufficed.) To answer the first question, we derived this table from the main dataset:
Emission and pollution taxes paid from 1999-2011
The table and chart give you an insight into the question above. They make it easy to identify who paid what and when, and to pinpoint the defaulters. Some other useful charts have been derived from the same dataset: nine different charts for your use. Who knows, you might find insights and answers to some of the questions you have.
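For readers who prefer code to a spreadsheet, the sorting and filtering step described above can be sketched in Python. This is only an illustration, not the method we used in the session: the rows are made-up placeholders, and the column names (`company`, `year`, `payment_type`, `amount_usd`) and the helper functions `pivot_payments` and `missing_years` are assumptions rather than anything taken from the real dataset.

```python
from collections import defaultdict

# Hypothetical rows standing in for the real dataset. The column names
# and all values are assumptions made purely for illustration.
rows = [
    {"company": "Company A", "year": 2009, "payment_type": "Emission and pollution tax", "amount_usd": 120000.0},
    {"company": "Company A", "year": 2010, "payment_type": "Emission and pollution tax", "amount_usd": 95000.0},
    {"company": "Company A", "year": 2010, "payment_type": "Royalty", "amount_usd": 4000000.0},
    {"company": "Company B", "year": 2009, "payment_type": "Emission and pollution tax", "amount_usd": 60000.0},
    {"company": "Company B", "year": 2010, "payment_type": "Royalty", "amount_usd": 2500000.0},
    {"company": "Company C", "year": 2009, "payment_type": "Royalty", "amount_usd": 1000000.0},
]


def pivot_payments(rows, payment_type):
    """Keep one payment type and total the amounts per company and year."""
    table = defaultdict(lambda: defaultdict(float))
    for row in rows:
        if row["payment_type"] == payment_type:
            table[row["company"]][row["year"]] += row["amount_usd"]
    return table


def missing_years(rows, table):
    """For each company, list the years in the data with no recorded payment."""
    years = sorted({row["year"] for row in rows})
    companies = sorted({row["company"] for row in rows})
    return {
        company: [y for y in years if table.get(company, {}).get(y, 0.0) == 0.0]
        for company in companies
    }


taxes = pivot_payments(rows, "Emission and pollution tax")
gaps = missing_years(rows, taxes)

# Print the companies with at least one year lacking a recorded payment.
for company, years in gaps.items():
    if years:
        print(company, "has no recorded payment in", years)
```

A year with no recorded payment is a lead to follow up rather than proof of default, since it may simply reflect a gap in the data.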