Matching and Clustering Core for Bimlib 2.0

Using machine learning algorithms, we created the heart of BimLib 2.0, a CAD-independent platform that helps estimate the cost of construction projects.

 

I would like to thank you for your contribution to the development of the BIMLIB platform for comprehensive predictive assessment based on neural network technologies. BIMLIB is pleased to be a partner of Apro and looks forward to further fruitful cooperation.

Anton Reshetnikov

CFO, BIMLIB

Our Achievements

  • Object matching with 95% accuracy based on the object’s metadata
  • Development of a model training mechanism and the required API
  • Development of a BIM class clustering algorithm

Technology

  • Python
  • ML / AI

Introduction

Machine learning is changing the way we live and work. It is a powerful tool that can build forecasting models directly from raw data.

We took advantage of its algorithms to develop the core of Bimlib 2.0, a CAD-independent platform that helps to estimate the cost of construction projects.

 

What the client wanted

The client asked us to develop the core of the product based on ML/AI technologies.

The system had to match objects based on their description, metadata, and attributes.

We were also asked to develop an API on top of the core for training and using the model.

 

The Apro team's performance

The ML core developed by APRO made it possible to build the BimLib 2.0 product, which can update data on the cost and availability of products within a few minutes.

Previously, this operation took a team of several people several weeks.

Using machine learning algorithms, the system can also search and compare the catalogs of various companies and combine the manufacturers’ catalogs with the suppliers’ catalogs.

The project was delivered by a team of two professionals: one project manager and one developer.

 

What is Bimlib?

Bimlib 2.0 is a digital platform for the quick calculation of building materials and equipment costs. This system extracts metadata from your building project created with any CAD solution.

Using machine learning algorithms, Bimlib 2.0 links the specification of materials with the BIM catalog.

The catalogs of building materials sellers are matched with the BIM catalog in the same way.

As a result, building costs are calculated semi-automatically in just a few minutes.

 

The features we implemented

The model solves the problems of placement, storage, and access to the tree of products through the API, taking into account the non-trivial structure of the model.

The main elements of the model

  • Classes: groups of products united by related categories, e.g. lamps.
  • Categories: specific product groups, e.g. table lamps.
  • Attributes: characteristics of a specific product item, e.g. supply voltage.
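
To make the three elements more concrete, here is a minimal sketch of how such a product tree could be represented in Python. All names here (ProductClass, Category, Attribute, the unit field, and the helper methods) are hypothetical and chosen only for illustration; the actual BimLib data model is not described in this article.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Attribute:
    name: str        # e.g. "supply voltage"
    unit: str = ""   # hypothetical field, not part of the original description


@dataclass
class Category:
    name: str        # e.g. "table lamps"
    attributes: Dict[str, Attribute] = field(default_factory=dict)

    def add_attribute(self, attr: Attribute) -> Attribute:
        # Return the existing attribute if the name is already known,
        # so the same attribute is never stored twice.
        return self.attributes.setdefault(attr.name, attr)


@dataclass
class ProductClass:
    name: str        # e.g. "lamps"
    categories: Dict[str, Category] = field(default_factory=dict)

    def add_category(self, name: str) -> Category:
        # Same idea: reuse the category when it already exists.
        return self.categories.setdefault(name, Category(name))


lamps = ProductClass("lamps")
table_lamps = lamps.add_category("table lamps")
table_lamps.add_attribute(Attribute("supply voltage", unit="V"))
```

Keying categories and attributes by name is one simple way to avoid exact duplicates; it does not catch near-duplicates such as misspelled names, which is the job of the clustering step.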

 

The model’s characteristics

  • Extensibility: The model can be extended by adding new categories and attributes without any limits.
  • Dynamism: The model has a dynamic class structure, which is revised each time the model is updated.
  • Uniqueness: The model eliminates the duplication of classes, categories, and attributes of products and, as a result, of the products themselves. This is achieved by the intelligent clustering of product items.
    New classes, categories, and attributes can be added to the model only if a certain product cannot be clustered in the current state of the model.
  • Flexibility: The model has a flexible structure and does not contain predefined immutable elements; because of this, it allows the placement of any items.

 

Data clustering

Data clustering extends or updates the generic data model with new classes, categories, product attributes, and the relationships between them.

Information about the products themselves is placed in the relational database associated with the model.

The clustering process includes several stages:

  • Normalization of attribute names: the description of each attribute (length, height, width, etc.) is transformed to its normal form (the standard or most general form of that attribute). This process takes into account all the existing model attributes.
  • Product placement in the model according to the normalized attributes, taking into account all existing categories of the model. At this stage, the product is placed in several of the most probable branches of the product tree.
    After that, a nonlinear concatenation of the results is performed, which determines the preferable branch.
  • Assignment of an object to certain classes of products based on the similarity of their attributes and attribute groups, taking into account all classes of the model.
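
The first stage, attribute name normalization, can be illustrated with a small sketch. It assumes a simple string-similarity approach based on Python's standard difflib module; this is a stand-in chosen for brevity, not the method used in BimLib. The function name normalize_attribute and the cutoff value are hypothetical.

```python
import difflib


def normalize_attribute(raw_name, known_attributes, cutoff=0.8):
    """Map a raw attribute name onto an existing normal form, if one is close enough.

    Returns (normal_form, is_new). is_new is True when no known attribute is
    similar enough, i.e. when the model would have to be extended.
    """
    # Basic cleanup: lower-case, treat underscores as spaces, collapse whitespace.
    cleaned = " ".join(raw_name.lower().replace("_", " ").split())
    # Pick the single most similar known attribute above the cutoff (assumed value).
    matches = difflib.get_close_matches(cleaned, known_attributes, n=1, cutoff=cutoff)
    if matches:
        return matches[0], False
    return cleaned, True


known = ["supply voltage", "length", "height", "width"]
existing = normalize_attribute("Supply_Voltage", known)
unknown = normalize_attribute("Beam Angle", known)
```

In this sketch a new attribute is proposed only when nothing in the model is similar enough, which mirrors the rule that the model is extended only when an item cannot be clustered in its current state.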

This module decides whether the model needs to be expanded with new attributes, categories, or classes. The decision to add each element of the model is made independently.

Whether an item belongs to a given attribute, category, or product class of the model is assessed using the probabilistic ranking algorithm BM25.

The same algorithm is used to check for duplicates, including implicit duplicates, which ensures the uniqueness of the model.
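
For readers unfamiliar with BM25, the sketch below scores a short query against a list of candidate texts using the standard Okapi BM25 formula with the non-negative IDF variant ln((N - n + 0.5) / (n + 0.5) + 1). The function name bm25_scores, the whitespace tokenizer, and the parameter values k1=1.5 and b=0.75 are common defaults assumed here for illustration; how BimLib tokenizes its data and tunes the algorithm is not stated in this article.

```python
import math
from collections import Counter


def bm25_scores(query, documents, k1=1.5, b=0.75):
    """Return one BM25 score per document for the given query."""
    if not documents:
        return []
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Average document length; fall back to 1 to avoid division by zero.
    avg_len = (sum(len(d) for d in tokenized) / n_docs) or 1.0
    # Number of documents that contain each term.
    doc_freq = Counter(term for d in tokenized for term in set(d))

    scores = []
    for doc in tokenized:
        tf = Counter(doc)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log((n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1.0)
            length_norm = k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf[term] * (k1 + 1) / (tf[term] + length_norm)
        scores.append(score)
    return scores


categories = ["table lamps", "ceiling lamps", "floor tiles"]
scores = bm25_scores("table lamps", categories)
best = categories[scores.index(max(scores))]
```

A very high score against an existing entry can be read as a signal of a possible duplicate, while uniformly low scores suggest that a new element may be needed; the thresholds for either decision would be an additional assumption.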

The quality of the clustering result is improved by using the Stable Matching algorithm (Nobel Prize in Economics, 2012).
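
The classic form of stable matching is the Gale-Shapley deferred acceptance procedure, sketched below. The function name stable_matching and the example of matching products to categories are hypothetical, and the sketch assumes every receiver ranks every proposer; the article does not specify how the algorithm is wired into the clustering pipeline.

```python
def stable_matching(proposer_prefs, receiver_prefs):
    """Gale-Shapley deferred acceptance.

    proposer_prefs: {proposer: [receivers, best first]}
    receiver_prefs: {receiver: [proposers, best first]} (assumed complete)
    Returns {proposer: receiver} for every matched proposer.
    """
    # rank[r][p] is the position of proposer p in receiver r's list (lower is better).
    rank = {r: {p: i for i, p in enumerate(prefs)} for r, prefs in receiver_prefs.items()}
    next_choice = {p: 0 for p in proposer_prefs}  # index of the next receiver to try
    engaged = {}                                  # receiver -> current proposer
    free = list(proposer_prefs)

    while free:
        p = free.pop()
        prefs = proposer_prefs[p]
        if next_choice[p] >= len(prefs):
            continue  # p has been rejected by everyone and stays unmatched
        r = prefs[next_choice[p]]
        next_choice[p] += 1
        current = engaged.get(r)
        if current is None:
            engaged[r] = p            # r was free and accepts for now
        elif rank[r][p] < rank[r][current]:
            engaged[r] = p            # r prefers p and releases its current partner
            free.append(current)
        else:
            free.append(p)            # r rejects p, who will try the next choice

    return {p: r for r, p in engaged.items()}


# Hypothetical example: two products competing for two categories.
product_prefs = {"A": ["x", "y"], "B": ["x", "y"]}
category_prefs = {"x": ["B", "A"], "y": ["A", "B"]}
matching = stable_matching(product_prefs, category_prefs)
```

The result is stable in the sense that no product and category would both prefer each other over their assigned partners. In a clustering setting, the preference lists could be derived from similarity scores such as BM25, though that link is an assumption here.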

Figure 1 (flowchart) shows:

N: number of BimLib classes;

[a]: Stable matching algorithm, “Stable matching: Theory, evidence, and practical design”, https://www.nobelprize.org/uploads/2018/06/popular-economicsciences2012.pdf
