Integrated circuit (IC) chips form the backbone of the electronics industry, and like any other major industry, its component manufacturers are in constant flux: changing, dividing, and merging ownership. For this reason, successful IC startups or spinoffs often have components that are still sold and used on modern circuit boards even though the company itself has little online presence. Needless to say, this can make tracking down individual components difficult.
Logos
IC components are small. Many of the most popular components (resistors, capacitors, etc.) rarely carry manufacturer information; even the most basic details, like a company logo or part number, can be missing. And even when a part number is present, it can be difficult to track down which manufacturer created and sold the part, making replacement nearly impossible.

Logos for each brand are double-checked on various websites to avoid incorrect data labels.
Dataset
The dataset collected for this project comprises over 2,100 hand-picked and hand-labeled images spanning more than 60 component manufacturers. Although still growing, the dataset includes companies like Motorola, Acer, Sanyo, Texas Instruments, and many more. Images were collected from mass-sales websites like eBay, Alibaba, and IndiaMART, as well as image repositories like Google Images, Wikimedia, and TheRetroWeb.
Still under construction, the model requires each manufacturer dataset to contain at least 20 images to be included in the training set. As of writing, the largest manufacturer dataset contains 65 images (Toshiba), while the smallest contains only 1 image (RCWL).
The bar chart below plots the number of images for each manufacturer as of the time of writing (1/20/2026). The value on top of each bar gives the number of images for a specific manufacturer, and the name of the respective manufacturer is provided below each bar. The red dashed line illustrates the minimum 20-image cutoff. For instance, Fujitsu, Acer, Trident, and 3DFX each contain 20 images, but “ir” (International Rectifier) contains 10 images, while Harris currently contains only six images.

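The cutoff logic is simple to express in code. The sketch below uses a handful of the counts quoted above (not the full dataset) to show how manufacturers are filtered against the 20-image minimum:

```python
# Illustrative sketch of the 20-image cutoff; the counts listed here are
# examples taken from the text, not the complete dataset.
MIN_IMAGES = 20

image_counts = {
    "Toshiba": 65, "Fujitsu": 20, "Acer": 20, "Trident": 20,
    "3DFX": 20, "ir": 10, "Harris": 6, "RCWL": 1,
}

# Keep only manufacturers that meet the minimum-image threshold.
eligible = {name: n for name, n in image_counts.items() if n >= MIN_IMAGES}
```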
An Incorrect Solution
The model used for this study compares a variety of transfer-learning models, including ResNet, EfficientNet, DenseNet, and others. For each model, the base layers remain frozen while the later classification layers are left trainable to accommodate the 60+ manufacturer classes.
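As a minimal sketch of this frozen-base, trainable-head setup, the PyTorch snippet below uses a tiny stand-in backbone rather than an actual pretrained ResNet or EfficientNet, and assumes a 65-class output; the real pipeline would load pretrained weights instead:

```python
import torch.nn as nn

# Hypothetical stand-in for a pretrained backbone; a real pipeline would
# load e.g. torchvision weights here.
backbone = nn.Sequential(
    nn.Conv2d(3, 8, 3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
)
for p in backbone.parameters():
    p.requires_grad = False  # freeze the base layers

num_classes = 65                   # assumed class count
head = nn.Linear(8, num_classes)   # trainable classification head
model = nn.Sequential(backbone, head)

# Only the head's parameters would be handed to the optimizer.
trainable = [p for p in model.parameters() if p.requires_grad]
```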
Results
Once the data are split, each training set is run through an array of different models, including ResNet, EfficientNet, MobileNet, and DenseNet. Inference is then run on each trained model to find Test Accuracy and the Test F1-score. Results are illustrated in the table below.

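For reference, test accuracy and macro-averaged F1 can be computed from predicted and true labels with a small helper. This is an illustrative implementation, not the project's actual evaluation code:

```python
from collections import defaultdict

def accuracy_and_macro_f1(y_true, y_pred):
    """Compute accuracy and macro-averaged F1 from two label lists."""
    acc = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

    # Per-class true positives, false positives, false negatives.
    tp, fp, fn = defaultdict(int), defaultdict(int), defaultdict(int)
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1
            fn[t] += 1

    f1_scores = []
    for c in set(y_true) | set(y_pred):
        prec = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        rec = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        f1_scores.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return acc, sum(f1_scores) / len(f1_scores)
```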
Of the models utilized, EfficientNetB7 produced the best results with a test accuracy of 75.91%. Given that the model is being asked to classify between 65 unique manufacturers, this is well above random guessing (a 1.5% probability of a correct guess). Beyond raw performance, it is worth asking how confident the model is when making its decisions; this is illustrated in the plots below.
The plot illustrates the confidence score of the model when it correctly guesses a manufacturer (green bars) versus when it guesses incorrectly (red bars). With a few exceptions, most correct guesses come with a high confidence score, typically greater than 80%, while incorrect guesses tend to come with very low confidence (~20%).

Clearly, the model is picking up on image details that lead it to make correct predictions. But what details is it picking up on?
At the beginning of our study, we hypothesized that the model would pick up on the manufacturer's logo: unique shapes would give rise to geometric features the model could learn. But by looking at attention heatmaps from successful predictions (below), we find this is not the case.

The above image is a composite showing the original component image overlaid with a semi-transparent heatmap. Locations where the heatmap is red indicate a greater focus of attention by the computer vision model; blue and green portions indicate very low attention. Overlaid on top are lime-green rectangles, marked with arrows; these rectangles were added after the attention maps were generated to clearly indicate the location of the logo we hypothesized the model would pick up on.
Looking at the rectangles, though, we see this is clearly not the case. In each example, the Qualcomm, Geode, Vishay, Intel, and Motorola logos fall well outside the central point of the model’s attention! What, then, is it looking at?
In general, the model seems to have found a shortcut: it has begun picking out patterns in the structure of the part numbers themselves, not the company logo. Keep in mind that each image in each dataset is explicitly chosen to be different in some way. For example, each chosen component within a manufacturer dataset carries a different part number, and is likely photographed against a different background, from a different angle, and under different lighting. It is interesting, then, that these are the details the model picks up on to achieve its 76% test accuracy! In the case of the SOIC (final image on the right), it almost seems to be identifying the shape and number of pins connecting the component to the circuit board.
A better solution
Unlike the single-model pipelines used to predict package type and locate component damage, predicting component manufacturer entails solving two problems: 1) identifying the logo on the component, and 2) classifying the logo as the correct brand. To solve a two-problem situation, we need a two-model pipeline.
Logo Location
The first model utilizes the You Only Look Once (YOLO) object-detection algorithm. Following the tragic performance of the previous model, each image in the dataset was annotated using a Jupyter-based graphical user interface (GUI), making it efficient to annotate the logos in all 2,800 images.

Each annotation is saved in a text file, and includes information on class ID, the (X,Y) coordinate of the box’s center pixel, and the width and height of the green box surrounding the logo. These annotation files act as the “truth” data for each associated training image.
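A line of such an annotation file can be parsed with a few lines of Python. This is a sketch assuming the standard five-field YOLO layout described above (class ID, then normalized center coordinates, width, and height):

```python
def parse_yolo_annotation(line):
    """Parse one line of a YOLO-format label file:
    'class_id x_center y_center width height' (coordinates normalized to [0, 1])."""
    fields = line.split()
    return {
        "class_id": int(fields[0]),
        "x_center": float(fields[1]),
        "y_center": float(fields[2]),
        "width": float(fields[3]),
        "height": float(fields[4]),
    }
```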
As shown in this illustration from “Object detection in real time based on improved single shot multi-box detector algorithm“, each iteration of the YOLO model calculates (and seeks to maximize) the Intersection over Union (IoU) metric for a multitude of candidate boxes within the image. For each box, the model calculates the overlapping area between the ground truth and the current box: the more a candidate box overlaps with the ground truth, the higher the IoU value.

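The IoU calculation itself is straightforward; a minimal implementation for axis-aligned boxes in (x1, y1, x2, y2) corner format might look like:

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes in (x1, y1, x2, y2) form."""
    # Corners of the intersection rectangle.
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0
```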
As training drives this metric higher through the candidate boxes' size and location, the model learns to identify and enclose a component's logo. Instances of identified logos are illustrated below.

Combined with the label (manufacturer name) for each image, the extracted logo crops form a second dataset used to train the subsequent model.
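Producing those crops requires converting a normalized YOLO box back into pixel coordinates. A minimal sketch of that conversion, rounding to the nearest pixel, might look like:

```python
def yolo_box_to_pixels(box, img_w, img_h):
    """Convert a normalized YOLO box (x_center, y_center, width, height)
    into integer pixel coordinates (left, top, right, bottom) for cropping."""
    xc, yc, w, h = box
    left = round((xc - w / 2) * img_w)
    top = round((yc - h / 2) * img_h)
    right = round((xc + w / 2) * img_w)
    bottom = round((yc + h / 2) * img_h)
    return left, top, right, bottom
```

With a library such as Pillow, the returned tuple could be passed directly to an image's `crop()` method to extract the logo.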
Manufacturer Classification
The second model is trained on the extracted logo crops to classify each logo into one of the 70+ unique manufacturer brands. Unlike the logo identification model, training data for classification is divided into three sets: (1) training, (2) validation, and (3) held-out test data.
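A simple way to produce such a three-way split is to shuffle and slice the dataset. The 70/15/15 ratios below are assumptions for illustration, as the article does not state the exact split:

```python
import random

def split_dataset(items, val_frac=0.15, test_frac=0.15, seed=42):
    """Shuffle and split items into (train, val, test) lists.
    The fractions are illustrative assumptions, not the project's ratios."""
    rng = random.Random(seed)   # fixed seed for a reproducible split
    items = items[:]            # avoid mutating the caller's list
    rng.shuffle(items)
    n_test = int(len(items) * test_frac)
    n_val = int(len(items) * val_frac)
    test = items[:n_test]
    val = items[n_test:n_test + n_val]
    train = items[n_test + n_val:]
    return train, val, test
```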
Similar to the whitetail deer aging models, EfficientNet-B7 (EB7) was chosen as the transfer-learning model for classification. Comprising 66 million parameters, EB7 achieved 97.15% test accuracy. The model’s confusion matrix, illustrated below, emphasizes the strong performance via a strong diagonal.

as well as very high confidence scores amongst correct predictions:

The right plot is especially telling: many of the correct predictions were made with confidence at or well above 85%, showing the model’s strength in characterizing the logo crops selected by the first logo identification model.
Conclusion
The project to utilize Computer Vision (CV) models to classify component manufacturers by their logos was an interesting one. The preliminary hypothesis was that a single CV model could locate, characterize, and correlate logos on electronic components to determine the correct vendor. However, attention-map analysis of the highest-performing models showed that this was not the case.
Instead, a two-model approach was used. The pipeline comprises two models: one to identify and capture the logo, and a second to characterize the extracted logo and predict the part’s manufacturer. Although not perfect, the two-model approach performs very well, demonstrating >97% test accuracy on held-out data across 70+ manufacturers. As more data are added in the future, the first model will not need to be rerun! Instead, only the second model will need to be retrained as a wider variety of manufacturers is added.
Needless to say, this project was eye-opening, and has far-reaching implications in defense, part identification, and other sectors as well. If you’re interested in the power of single-model CV pipelines, don’t forget to check out my whitetail deer aging study, alive and well on AgeMyDeer!