A PROJECT WITH CONDÉ NAST
THE CONTEXT
Condé Nast turned to the RES Group in search of a partner to help enhance the information assets of its “La Cucina Italiana” brand, which comprise thousands of culinary recipes.
These recipes, already well known and appreciated in the sector, could be enriched with a wealth of additional information, such as nutritional values and food footprints, to offer added value to customers, partners and end users of its digital platform.
This was the start of the “Smart Recipe Extractor”.
THE CHALLENGE
La Cucina Italiana and the other Condé Nast brands in the food sector hold an abundance of recipes and textual data. This is a long-lasting and highly valuable asset, because it is tied to primary needs and cultural aspects such as nutrition and culinary traditions.
This data is largely digitized: it was either created digitally or digitized from historical paper archives.
The recipe texts are written in several natural languages, including Italian, American and British English, French, German and Spanish.
A portion of these texts had already been associated with structured metadata needed for editorial purposes, but the asset could be further enhanced by associating additional metadata, extractable from the text sources themselves or from other information sources.
For this data enhancement activity, Condé Nast was looking for a partner able to develop and implement an end-to-end project following an agile methodology. RES Group made available its multidisciplinary team of experts in natural language processing (NLP).
The RES team, composed of data scientists, data managers and business analysts, can follow the entire process, from the creation of the custom training corpus (in the main European languages, starting with Italian) to the implementation of the relevant algorithms using recent advances in deep learning.
THE SOLUTION
In a preliminary phase, a Proof of Concept (PoC) showed that the list of ingredients mentioned in a recipe can be derived from its text automatically and with sufficient accuracy. The same phase demonstrated that these ingredients can be traced back algorithmically to predetermined taxonomic categories.
Thanks in part to the results of the PoC, Condé Nast placed its trust in RES, and together we created an end-to-end system capable of classifying, interpreting and extracting information from the recipes of La Cucina Italiana. It was named the “Smart Recipe Extractor”.
Characteristics:
- Creation of manually annotated multilingual corpora
- Design of a stateless / functional micro-services architecture
- Implementation of a classification system based on NER (CRF Classifier)
- Use of state-of-the-art approaches in the NLP area (e.g. BERT & ELMo in pre-training)
- Deployment of deep learning components with Amazon EKS (Elastic Kubernetes Service)
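
To make the ingredient extraction and taxonomy mapping described above more concrete, here is a minimal sketch in Python. It is not the Smart Recipe Extractor's actual code. It assumes that an NER tagger (for example a CRF classifier) has already labelled each token with a BIO tag, and shows how those tags could be grouped into ingredient mentions and mapped to categories. The tag names (B-ING, I-ING), the TAXONOMY table and both function names are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical taxonomy (ingredient -> category), for illustration only.
TAXONOMY: Dict[str, str] = {
    "spaghetti": "pasta",
    "guanciale": "cured meat",
    "pecorino romano": "cheese",
    "eggs": "eggs",
}


def extract_ingredients(tokens: List[str], tags: List[str]) -> List[str]:
    """Group BIO-tagged tokens into ingredient mentions.

    "B-ING" opens a mention, "I-ING" continues the open mention,
    and any other tag closes it. An "I-ING" with no open mention is ignored.
    """
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")

    mentions: List[str] = []
    current: List[str] = []
    for token, tag in zip(tokens, tags):
        if tag == "B-ING":
            if current:
                mentions.append(" ".join(current))
            current = [token]
        elif tag == "I-ING" and current:
            current.append(token)
        else:
            if current:
                mentions.append(" ".join(current))
            current = []
    if current:
        mentions.append(" ".join(current))
    return mentions


def categorize(mentions: List[str], taxonomy: Dict[str, str]) -> List[Tuple[str, str]]:
    """Map each mention to a taxonomy category, or "unknown" if absent."""
    return [(m, taxonomy.get(m.lower(), "unknown")) for m in mentions]


# Example tags as an assumed tagger might produce them for one sentence.
tokens = ["Cook", "the", "spaghetti", "and", "grate", "the", "Pecorino", "Romano", "."]
tags = ["O", "O", "B-ING", "O", "O", "O", "B-ING", "I-ING", "O"]

ingredients = extract_ingredients(tokens, tags)
print(categorize(ingredients, TAXONOMY))
```

In a real system the dictionary lookup would likely be replaced by a more robust matching step (lemmatization, synonyms, multilingual variants), since recipe texts rarely name ingredients in a single canonical form.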
THE TESTIMONY
“Italian excellence in the food sector and increasingly fierce competition have led us to think about how to manage our recipes in a modern way and add value to them through data analysis and machine learning.
We needed a cutting-edge technology partner, smart and able to guide us toward the right choices, and with RES we found what we were looking for.”
Marco Viganò – Digital Chief Technology Officer Edizioni Condé Nast