As with many expressions nowadays, "big data" often is tossed around but rarely understood. How many safety leaders fully grasp what big data is or how it is relevant in the safety industry?
Big data has been defined as "an accumulation of data that is too large and complex for processing by traditional database management tools." Though this definition may seem rather vague, there are a few key components worth delving into in the pursuit of understanding the concept.
First, let's take a look at the phrase "an accumulation of data that is too large and complex…" Again, do not be alarmed by the broadness of this portion of an already broad definition.
In big data, "large and complex" actually is quite intuitive. For example, an Excel file with 10 rows and 2 columns would not be considered "large and complex" by most professionals today. However, 30 years ago, before the average office employee was at least somewhat familiar with spreadsheet tools, a 10 x 2 sheet may have been a source of panic. This illustrates the first curious point about big data: It's all relative.
The second portion of the definition drives home this theory of big data relativity: "…for processing by traditional database management tools." Today, the average computer has approximately 8 GB of RAM. This is more than adequate to handle a spreadsheet the size of the one in the example above.
In fact, today's spreadsheet software can handle more than 1 million rows and 16,000 columns per worksheet (1,048,576 by 16,384 in current versions of Microsoft Excel). Although most modern laptops will slow to a frustrating crawl if you attempt to open a workbook of that size, the software theoretically does have the capability, and therefore technically is not inadequate.
Again, the inadequacy of the applications, and consequently the second half of the definition of big data, is in the eye of the beholder.
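To make the relativity point concrete, here is a minimal sketch that checks whether a table fits on a single worksheet. The limits are those of current versions of Microsoft Excel; the function name and the sample table sizes are illustrative assumptions and do not come from any particular product's API.

```python
# Worksheet limits for current versions of Microsoft Excel (.xlsx).
MAX_ROWS = 1_048_576
MAX_COLS = 16_384


def fits_on_worksheet(rows: int, cols: int) -> bool:
    """Return True if a rows x cols table fits on one worksheet."""
    return 0 <= rows <= MAX_ROWS and 0 <= cols <= MAX_COLS


# Hypothetical table sizes, chosen only for illustration.
print(fits_on_worksheet(10, 2))          # the 10 x 2 sheet from the example
print(fits_on_worksheet(5_000_000, 40))  # five million observation rows
```

Fitting within the limit says nothing about whether a given laptop can open the file comfortably. The threshold for "too large" depends on both the tool and the machine.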
So, we see that relativity is an important factor in determining what big data is. But relativity is not a clear and concise definition, and if you're a mathlete and student of logic, you're probably begging for something more precise. The research company Gartner describes big data in terms of "The 3 Vs": volume, variety and velocity.
Volume: Why Is Big Data Relevant?
Thanks to Gartner, we finally can delve into how big data is relevant in workplace safety, starting with volume. This is fairly self-explanatory. In order for data to be classified as big data, there needs to be lots and lots of it.
One of the best examples in the safety world is the number of inspections, and the more the merrier. As safety professionals gravitate more toward behavior-based safety (BBS) inspections versus inspections based on pure compliance, the number of "items" observed on worksites is increasing. In BBS, we no longer are just making sure that the guard on the saw is up to code, but also that the worker using that particular saw is utilizing the correct PPE and that he or she is operating that machine correctly. By observing more and more activity on job sites, we generate more and more data.
Assuming that safety professionals are collecting and reporting on that data rather than just filing it in a cabinet somewhere, more data makes for more revealing reporting. Additionally, a best practice in BBS is to have a more diverse set of workers perform these inspections. That is, where once only safety professionals observed the workforce, anyone now can observe safe or at-risk behavior among colleagues. Today, more areas of industry are observed by more people coming from more backgrounds than in days past. As a consequence, the volume of data collected surpasses that of just "data" and graduates into "big data."
Variety: Methods of Data Collection
The next "V" is "variety." Variety refers to the different methods by which data is collected and the different forms it takes. Not too long ago, standard practice for safety inspections was to rely on good, old-fashioned pencil and paper to complete fixed checklists. In fact, that likely remains standard practice for many of you reading this.
Conversely, thanks to advancements in technology, we now enjoy the ability to perform inspections via mobile devices and tablets. What used to be primarily analogue now is assimilating nicely into the digital world.
Due to the different types of mobile devices at our disposal, the "variety" of our data has increased dramatically. Sensor technology, such as that used on production lines and forklifts, and the Internet also have changed the game of data collection. Think of these sources as streams of data that take on a new "form," departing from the paper-based checklists of yesteryear. Images and video are other formats of information we collect on a daily basis, all geared toward understanding where the risk is on our job sites.
Velocity: The Speed of Data Collection
The final "V" is "velocity." In safety, velocity is not distance over time (v ≠ d/t) but a measure of how quickly our data is collected. The advancement of technology has given us many of the finer things in life, such as information always at our fingertips. Information is available to us as quickly as we can type a search into Google.
This phenomenon is replicated in many business intelligence (BI) tools currently on the market. Gone are the days when we had to email someone in IT to run a report for us, only to wait a week or more to receive it. Today, a foreman can observe the worksite, enter that information into a phone and, within minutes or even seconds, log into a BI suite from a web browser and see the high areas of risk in the business. The speed at which data is both collected and analyzed is ever increasing, and information is available in near real time.
The safety industry, with a few slower-to-catch-on exceptions, meets all the criteria of the three Vs of big data. As technological advancement steadily intensifies, we'll see a corresponding move toward the three Vs by those who haven't hit this threshold yet.
The Right Tools
If we return to the idea of relativity, specifically the part of the definition about data being too much "for processing by traditional database management tools," what good is data if we don't have the tools to analyze it? If your workforce is generating 1,000 inspections per month at each of your job sites, appending images of at-risk behavior and penning long descriptions of the antecedents, how can you ingest and digest all of that information via a spreadsheet?
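A back-of-the-envelope estimate shows how quickly that adds up. The sketch below starts from the 1,000 inspections per month per job site mentioned above; every other figure (number of sites, items per inspection, images per inspection, average image size) is a hypothetical assumption chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class SiteLoad:
    """Monthly inspection load for one job site (figures are hypothetical)."""
    inspections_per_month: int
    items_per_inspection: int
    images_per_inspection: int
    avg_image_mb: float


def yearly_volume(load: SiteLoad, sites: int) -> tuple[int, float]:
    """Return (observation rows per year, image storage in GB per year)."""
    inspections = load.inspections_per_month * 12 * sites
    rows = inspections * load.items_per_inspection
    image_gb = inspections * load.images_per_inspection * load.avg_image_mb / 1024
    return rows, image_gb


# Assumed values: 20 sites, 25 observed items, 2 images of 3 MB per inspection.
load = SiteLoad(inspections_per_month=1_000, items_per_inspection=25,
                images_per_inspection=2, avg_image_mb=3.0)
rows, image_gb = yearly_volume(load, sites=20)
print(f"{rows:,} observation rows and about {image_gb:,.0f} GB of images per year")
```

Under these assumptions, a single year produces 6 million observation rows, several times more than one spreadsheet worksheet can hold, before the images and free-text descriptions are even considered.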
If the data you're collecting is at the cutting edge of technology, but the tools used to analyze that data are from the Stone Age, there is a weakest-link effect happening in your safety program. That is, you have a gold mine of information but no way to effectively evaluate it. It's time for your systems to become as smart as your data.
These smart systems need to be able to interpret and analyze the information as swiftly as it comes in, as well as keep up with the increasing size and variety of data sources.
BI tools are better equipped than ever to handle the amount of data a safety department can generate. Sadly, many safety departments still are stuck with legacy IT reporting tools and spreadsheet-based reports, both of which are slow and ineffective at turning data into insight.
Another area smart systems need to cover is the ability to uncover insights for us. That is, instead of reviewing reports and looking for risk, then interpreting that information into actionable items, our tools should tell us where to find the risk. Then, safety professionals can spend their time understanding how to fix that risk before it becomes a potential harm to a human worker.
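As a rough illustration of a tool that points to the risk rather than waiting for someone to find it, the sketch below ranks site and category combinations by their share of at-risk observations. The data, the field layout and the `min_observations` cutoff are all hypothetical, and a simple rate ranking is only a stand-in for whatever analysis a real BI or predictive tool performs.

```python
from collections import defaultdict

# Hypothetical observations: (site, category, was_at_risk).
observations = [
    ("Site A", "PPE", True),
    ("Site A", "PPE", False),
    ("Site A", "Machine guarding", False),
    ("Site B", "PPE", False),
    ("Site B", "Machine guarding", True),
    ("Site B", "Machine guarding", True),
    ("Site B", "Machine guarding", False),
]


def rank_by_at_risk_rate(obs, min_observations=2):
    """Rank (site, category) pairs by share of at-risk observations."""
    totals = defaultdict(int)
    at_risk = defaultdict(int)
    for site, category, was_at_risk in obs:
        key = (site, category)
        totals[key] += 1
        at_risk[key] += int(was_at_risk)
    rates = [
        (key, at_risk[key] / totals[key], totals[key])
        for key in totals
        if totals[key] >= min_observations  # skip pairs with too little data
    ]
    # Highest rate first; ties go to the pair with more observations.
    return sorted(rates, key=lambda r: (r[1], r[2]), reverse=True)


for (site, category), rate, n in rank_by_at_risk_rate(observations):
    print(f"{site} / {category}: {rate:.0%} at-risk over {n} observations")
```

The cutoff keeps a single bad observation from putting a rarely inspected area at the top of the list. A real system would need a more careful treatment of small samples.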
The world of safety is not lacking in data; in fact, it's drowning in it. It's time for a change in how we keep our workers safe and able to go home each night to their families. That change starts with smart data systems that allow us to apply our expertise in the most efficient way possible.
Let us no longer bury ourselves in IT and data-related items, trying to make heads or tails of the mountains of information pouring in every day, but utilize the smarter tools that already are available to reach smarter outcomes with our big data.
Nick Bernini is the manager of predictive analytics at Predictive Solutions Corp. Bernini can be reached at [email protected]. Tara Bachy is a support analyst at Predictive Solutions and can be reached at [email protected].