Contrary to a widespread idea, the intelligence of Big Data does not reside in the tool but in the method. The first step is to define your project clearly. Too often, companies rush headlong into Big Data projects without knowing where they are going or what they are looking for. What human purpose do you want to give your project? Will it serve to improve living conditions, do better with less out of concern for global ecology, or create better collaboration between the different departments of your company? What added value are you expecting from this project? What are your concrete objectives, and what key performance indicators will you implement? By asking the right questions, your project will be more relevant and effective. You will be more creative and better able to locate the available data sources – in data marketplaces and open data – that you can combine with your own to gain leverage.
The success of a Big Data project also depends on the means you use. Start by dedicating an agile team to your project. Multidisciplinary and self-organized, it will split the project into batches and work in short cycles, each validated by tests. Have data scientists accompany you over the long term. Through their external perspective and experience, these data specialists will help you better define your project and its development, and give meaning and consistency to your information. Consider involving the Data Protection Officer from the project outset to ensure that it complies with the European General Data Protection Regulation (GDPR). Do not neglect the importance of architectural choices either. Have you taken into account the three V's of Big Data, i.e. the volume, velocity and variety of the data? Will your IT infrastructure maintain its functionality and performance as it scales? Is your IT system scalable? Can it work with other products, existing or future, without access or implementation restrictions?
It is always better to prefer quality over quantity, and the same applies to Big Data. Checking the quality of the collected data is essential. Avoid the GIGO (Garbage In, Garbage Out) effect – if the input data is bad, the results will be bad too – and adopt a smart data approach that extracts, from the huge mass of your data, the subset most relevant to your project. Following the same logic, during the processing phase, switch from ETL – Extract, Transform, Load – to ELT – Extract, Load, Transform. In other words, develop an operational data hub that gives you a single 360° view of all your data, regardless of its source and format. Make sure the processing phase remains flexible so that you can modify a project component when necessary (for example, when new tools appear or the business adopts a new way of thinking). To save time and resources, decentralize processing and optimize your algorithms instead of relying on a single powerful supercomputer.
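The smart data idea described above can be sketched in a few lines of Python: load everything first (the "L" in ELT), then filter out garbage records before any analysis to avoid the GIGO effect. The field names (`user_id`, `amount`) and the sample records are purely hypothetical, chosen only to illustrate the filtering step.

```python
# Minimal "smart data" sketch: load raw records as-is, then keep only
# the records that pass basic quality checks before computing anything.
# Field names and values are hypothetical illustrations.

raw_records = [
    {"user_id": "u1", "amount": "42.50"},
    {"user_id": "",   "amount": "13.00"},   # missing id -> garbage
    {"user_id": "u3", "amount": "oops"},    # unparsable amount -> garbage
    {"user_id": "u4", "amount": "7.25"},
]

def is_clean(record):
    """Keep only records with a non-empty id and a numeric amount."""
    if not record["user_id"]:
        return False
    try:
        float(record["amount"])
        return True
    except ValueError:
        return False

# Transform happens *after* loading: only clean records feed the analysis.
clean = [r for r in raw_records if is_clean(r)]
total = sum(float(r["amount"]) for r in clean)
print(len(clean), total)  # 2 clean records, total 49.75
```

In a real project these checks would be far richer (schema validation, deduplication, referential checks), but the principle is the same: filter early, so that bad inputs never reach the results.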
In any Big Data project, the end user must have a say. The data returned after processing will only have real added value if those for whom it is intended can access and use it easily. This is why it is important to involve end users in decisions about ergonomics, readability and data relevance. If these conditions are not met, your project will fail.
As this article has shown, the parameters to take into account for the success of a Big Data project are numerous and often difficult to put in place without specialist assistance. Think about it before you start. The tool alone is not enough; you also need the right method.
Smart data in action: AdbA, in collaboration with LISER, is organising a conference and workshops on Thursday, November 16th (limited seats).