By David Bullock, Research Associate Professor
North Dakota State University
Agribusiness & Applied Economics Department
When you think about the term "big data," does corn come to mind?
Probably not. However, you may be surprised to learn that data and storable commodities, such as corn, have plenty in common. Before industrialization, commodities were consumed where they were produced. Therefore, the need for transportation infrastructure and storage capacity was quite limited.
However, as civilizations industrialized and populations moved to cities, large-scale storage and transportation infrastructure became necessary to feed the growing urban populations. This motivated the development of the modern storage and logistics technologies that are prevalent in today's agricultural industries from farm to fork.
With data, the storage technology has evolved from the clay tablets of antiquity to the DNA-based computer drives of the future. Data transmission (transportation) technologies have evolved from human messengers on foot to the high-speed, fiber optic data transmission networks of today.
The exponential growth in the volume and scale of data production (as with commodities) has pushed the technological development curve forward rapidly to take advantage of the extraordinary amount of data available on the "cloud" and through other sources.
A stored agricultural commodity has intrinsic value that is directly proportional to the ability to convert the commodity into a consumable product. For example, corn that is stored in a farmer's bin in north central Iowa has intrinsic value related to its location (transportation), form (quality premiums/discounts) and time value (interest/carry value).
In addition, stored corn has intrinsic value related to its highest-valued use. For example, it can be used as feed for the farm's farrow-to-finish operation, converted into ethanol at the plant 20 miles away or sold to the local elevator for eventual export to Japan. This intrinsic value often is reflected in what is known as the basis: the spread between the price of a generic commodity represented by a standardized contract (a futures contract) and the cash price received for a particular use (typically the highest-valued use).
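As a rough illustration of the basis arithmetic, the sketch below uses the common convention of cash price minus futures price. Every price and outlet value in it is a hypothetical assumption chosen for the example, not market data.

```python
# Hypothetical prices in dollars per bushel; none of these are real quotes.
FUTURES_PRICE = 4.50  # assumed price of the standardized futures contract

# Assumed cash values for the three uses mentioned in the text.
cash_values = {
    "on-farm feed": 4.10,
    "ethanol plant": 4.25,
    "local elevator": 4.05,
}


def basis(cash_price, futures_price):
    """Basis under the common convention: cash price minus futures price."""
    return round(cash_price - futures_price, 2)


# Basis for every outlet, then the outlet with the strongest basis.
basis_by_use = {use: basis(value, FUTURES_PRICE) for use, value in cash_values.items()}
best_use = max(basis_by_use, key=basis_by_use.get)

for use, value in basis_by_use.items():
    print(f"{use}: basis {value:+.2f}")
print("Highest-valued use:", best_use)
```

With these assumed numbers, every outlet has a negative basis, and the least negative one marks the highest-valued use. A disruption such as a transportation strike would show up here as lower cash values and therefore a wider basis.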
Note that without the ability to convert this commodity into a consumable product, it has no intrinsic value. For example, if a transportation strike occurred, the basis would widen substantially and greatly reduce the value of the corn (or at least limit the potential uses). In that case, having that farrow-to-finish hog enterprise as a hedge might be fortunate.
If we eliminate all potential uses for the corn in storage, it has no intrinsic value (except maybe for the psychological value that the farmer may attach to the grain in the bin). In addition, corn does not have an infinite shelf life and, eventually, its value will decline to zero.
For some commodities, the shelf life is very short (for example, milk) and the need for an immediate convertible use is paramount. Part of the rationale for the U.S. federal milk marketing order system is to guarantee producers an orderly market for their milk, given its very short shelf life in fluid form.
Like a commodity, stored data derive value only from the ability to convert them into a consumable product, which is primarily useful information and actionable insights. As with corn and milk, some data have a longer shelf life than others.
Enter Big Data Analytics
Analytics is the process of converting data into information and insights. Essentially, it is where "the rubber hits the road" when it comes to big data. Unfortunately, it is also the aspect that most data users find hardest to understand and implement effectively.
Big data entrepreneurial efforts can fall flat due to poorly implemented storage and transmission technologies: the old "garbage in, garbage out" paradigm. However, more often than not, these business disasters can be attributed primarily to a poor implementation of analytics or, even worse, a complete disregard for analytics.
Like the farmer who sells stored corn at the lowest price of the season and through the lowest-valued marketing channel, the user of poor analytics extracts little useful information or insight from the data. Even worse, poor analytics can result in suboptimal and even harmful decision making, leading to disastrous consequences for the user.
This is unfortunate because data analytics has advanced rapidly during the past 10 to 20 years. The analyst has new tools in the toolbox that were not available decades ago. Much of this advancement can be attributed to the increased processing power of modern computers, which makes techniques such as "data mining" algorithms practical, but also to the rapid advancement of statistical science, particularly in the area of multivariate statistics.
Most applications of big data analytics are driven toward answering one of the four basic questions that most users have with regard to using data.
- Descriptive analytics — What happened?
- Diagnostic analytics — Why did it happen?
- Predictive analytics — What will happen?
- Prescriptive analytics — What should I do?
A wide variety of computational, statistical and graphical tools can be employed to answer the four basic questions. These range from very basic graphical dashboards to highly sophisticated artificial intelligence (AI) applications.
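To make the four questions concrete, here is a minimal sketch that answers each one on a tiny data set using only basic statistics. The field records, the corn price and the nitrogen cost are all hypothetical assumptions, and a straight-line fit is only a stand-in for the more sophisticated tools mentioned above. Note that the fitted line shows an association in the data, not proof of cause.

```python
from statistics import mean

# Hypothetical field records: (nitrogen applied in lb/acre, corn yield in bu/acre).
records = [(100, 150.0), (120, 158.0), (140, 166.0), (160, 171.0), (180, 175.0)]
nitrogen = [n for n, _ in records]
yields = [y for _, y in records]

# Descriptive: what happened? Summarize the observed yields.
avg_yield = mean(yields)

# Diagnostic: why did it happen? Fit a least-squares line of yield on nitrogen
# to see how strongly yield moved with the nitrogen rate in these records.
n_bar, y_bar = mean(nitrogen), avg_yield
slope = (sum((n - n_bar) * (y - y_bar) for n, y in records)
         / sum((n - n_bar) ** 2 for n in nitrogen))
intercept = y_bar - slope * n_bar


# Predictive: what will happen at a nitrogen rate that was not tried?
def predict_yield(rate):
    return intercept + slope * rate


# Prescriptive: what should I do? Compare the revenue from one more pound of
# nitrogen with its cost (both prices are assumptions for illustration).
CORN_PRICE = 4.50     # assumed dollars per bushel
NITROGEN_COST = 0.60  # assumed dollars per pound
extra_pound_pays = slope * CORN_PRICE > NITROGEN_COST

print(f"Average yield: {avg_yield:.1f} bu/acre")
print(f"Yield change per extra lb of nitrogen: {slope:.3f} bu")
print(f"Predicted yield at 150 lb/acre: {predict_yield(150):.2f} bu/acre")
print("Extra nitrogen pays at the margin:", extra_pound_pays)
```

The same data answer all four questions; what changes is how much modeling sits between the raw records and the decision.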
Most farmers do not build bins with the intention of storing grain indefinitely. They build bins with the intention of adding value to their crop by carrying it from a period of low value to a period of higher value, or by adding flexibility to their use of the commodity. Likewise, farmers should not store large amounts of data without having a plan to convert the data into a higher-valued product.
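The storage decision described above can be sketched as simple arithmetic: storing pays only if the expected price gain exceeds the cost of carrying the grain. The function name, the cost structure and all of the numbers below are hypothetical assumptions for illustration, not a recommended marketing model.

```python
def net_gain_from_storage(price_now, expected_price_later, months,
                          storage_cost_per_month, annual_interest_rate):
    """Expected gain per bushel from storing grain instead of selling today."""
    storage_cost = storage_cost_per_month * months
    # Interest forgone (or paid) on the money tied up in unsold grain.
    interest_cost = price_now * annual_interest_rate * months / 12
    return expected_price_later - price_now - storage_cost - interest_cost


# Assumed example: hold corn for six months, all values in dollars per bushel.
gain = net_gain_from_storage(
    price_now=4.00,
    expected_price_later=4.50,
    months=6,
    storage_cost_per_month=0.03,
    annual_interest_rate=0.06,
)
print(f"Expected net gain from storing: {gain:.2f} per bushel")
```

The parallel for data is that collection and storage carry ongoing costs too, so the expected value of the insights should be weighed against them before the investment is made.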
Analytics is the process of converting data into useful information and actionable insights that add real value to the data collection process. Before investing in data collection and storage technologies, agricultural producers should have a plan in place that uses analytics to extract useful value from the data collected so that the four basic questions — What happened? Why did it happen? What will happen? What should I do? — can be answered effectively.