Accelerate Catalog Building with Attribute Extraction

If you spend hours pulling brand, size, and colour from product copy, you’re not alone. Data analysts in e‑commerce face a relentless flow of new items, each demanding clean, searchable attributes before a listing goes live.

You describe it

Attribute Extraction

1. Overview

This process reads a product’s textual description and any accompanying specification sheet, then pulls out three key product attributes—brand, size, and colour—into a clear, structured list.

2. Business Value

  • Consistent product data – Uniform attribute data improves search, filtering, and recommendation engines.
  • Faster catalog building – Automates a manual step, letting data analysts focus on higher‑value tasks.
  • Reduced errors – Standardised extraction limits inconsistencies caused by free‑form text.

3. Operational Context

  • When to run: Whenever a new product is introduced or an existing product’s description is updated.
  • Who uses it: Data analysts responsible for populating or maintaining the e‑commerce catalog.
  • Frequency: Typically run per product entry (i.e., once for each new or updated product record).

4. Inputs

Below are the items that must be supplied for a single run of this process.

  • Product Name – Text – The product’s commonly used name (e.g., “Nike Air Max 90 Men’s Shoes”).
  • Product Description – Text – The full marketing description or copy, typically a paragraph or a few bullet points, containing product details.
  • Specification Sheet – PDF Document – Optional PDF file that may contain additional details such as measurements, material, or colour codes.
Input Name | Type | Details Provided
Product Name | Text | The official product name as it appears on the retailer’s site.
Product Description | Text | Full text description of the product (plain text).
Specification Sheet | PDF Document | Optional PDF containing specifications, measurements, and colour details (if available).

5. Outputs

The process yields a structured list of the three required attributes.

  • Extracted Attributes – A table containing the three attributes and their values.
  • Extraction Summary – A short paragraph summarising the extracted data (e.g., “The product is a Nike shoe, size 9.5, in black.”).
Output Name | Contents | Formatting Rules
Extracted Attributes | A two‑column table (Attribute, Value) containing Brand, Size, and Colour. | Use the exact attribute names shown above. Capitalise the first letter of each value (e.g., “Nike”, “9.5”, “Black”).
Extraction Summary | One‑sentence summary of the extracted attributes. | Sentence case, ending with a period. Use a neutral, professional tone.
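
The two outputs could be represented roughly as follows. This is a minimal sketch, not the workflow's actual data model: the class name `ExtractionResult`, its fields, and the exact wording of the summary sentence are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ExtractionResult:
    """Hypothetical container for the two outputs described above."""

    brand: str
    size: str
    colour: str
    comments: list[str] = field(default_factory=list)

    def attribute_rows(self) -> list[tuple[str, str]]:
        # Two-column table using the exact attribute names from the spec.
        return [("Brand", self.brand), ("Size", self.size), ("Colour", self.colour)]

    def summary(self) -> str:
        # One sentence, sentence case, ending with a single period.
        # The sentence template itself is an assumption.
        return f"The product is by {self.brand}, size {self.size}, in {self.colour.lower()}."


result = ExtractionResult(brand="Nike", size="9.5", colour="Black")
for name, value in result.attribute_rows():
    print(f"{name}: {value}")
print(result.summary())
```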

6. Detailed Plan & Execution Steps

  1. Open the inputs – Retrieve the Product Name, Product Description, and, if supplied, the Specification Sheet.
  2. Identify the brand
     a. Scan the Product Name for a known brand name (see Appendix C – Brand List).
     b. If the brand is not obvious, search the internet for the product name and note the most prominent brand that appears.
  3. Identify the size
     a. Look for numeric patterns or standard size codes (e.g., “9.5”, “XL”, “12‑inch”) in the description and spec sheet.
     b. If more than one size is mentioned, list each size separated by commas.
  4. Identify the colour
     a. Scan the description and spec sheet for colour words from the colour list (Appendix C – Colour List).
     b. If multiple colours are present, list all colours in the order they appear, separated by commas.
  5. Record the values – Write the identified values into the “Extracted Attributes” table, preserving the source spelling and applying the capitalisation rules in Appendix C.4.
  6. Compose the summary – Produce a one‑sentence summary that combines the three attributes in the order: brand, size, colour. Example: “Nike Air Max 90, size 9.5, black.”
  7. Validate the extraction (see Section 7).
  8. Save the outputs – Store the “Extracted Attributes” table and the “Extraction Summary” as the final deliverables.
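
Steps 2 to 5 can be approximated with simple rules. The sketch below is a rule-based illustration only, not how the workflow itself is implemented. The function names are hypothetical, the brand and colour lists are abbreviated stand-ins for Appendix C, and the web search in step 2b and the PDF specification sheet are left out.

```python
import re

# Abbreviated stand-ins for the Appendix C lists (assumption: the real lists are longer).
BRANDS = ["Adidas", "Nike", "Puma", "New Balance"]
COLOURS = ["Black", "White", "Red", "Blue"]

# Matches sizes such as "9.5", "XL", or "12-inch"; deliberately simple.
SIZE_PATTERN = re.compile(r"\b(\d+(?:\.\d+)?(?:-inch)?|XXL|XL|XS|S|M|L)\b")


def find_brand(product_name: str) -> str:
    # Step 2a: scan the product name for a known brand (case-insensitive).
    lowered = product_name.lower()
    for brand in BRANDS:
        if brand.lower() in lowered:
            return brand
    return "Not Found"  # Step 2b (web search) is out of scope for this sketch.


def find_sizes(text: str) -> list[str]:
    # Step 3: collect size-like tokens in first-seen order, without duplicates.
    return list(dict.fromkeys(SIZE_PATTERN.findall(text)))


def find_colours(text: str) -> list[str]:
    # Step 4: collect listed colours in the order they first appear in the text.
    hits = []
    for colour in COLOURS:
        match = re.search(rf"\b{re.escape(colour)}\b", text, flags=re.IGNORECASE)
        if match:
            hits.append((match.start(), colour))
    return [colour for _, colour in sorted(hits)]


def extract(product_name: str, description: str) -> dict[str, str]:
    # Step 5: record the values, falling back to "Not Specified" when nothing is found.
    return {
        "Brand": find_brand(product_name),
        "Size": ", ".join(find_sizes(description)) or "Not Specified",
        "Colour": ", ".join(find_colours(description)) or "Not Specified",
    }


attributes = extract(
    "Adidas UltraBoost Running Shoes",
    "These shoes feature a breathable knit upper in black and white. "
    "Available in sizes 8, 9, 10, and 11.",
)
print(attributes)
```

Sizes are searched only in the description, as step 3 specifies, so a number in the product name such as the "90" in "Air Max 90" is not mistaken for a size.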

7. Validation & Quality Checks

Check | Description
Brand Presence | Confirm a brand value is present; if not, flag for manual review.
Size Format | Verify size follows a recognised pattern (numeric, “XS‑XL”, or measurement).
Colour Validity | Ensure each colour matches an entry from the Colour List (Appendix C).
Consistency | Compare values found in the description and the spec sheet – they must not conflict.
Completion | All three attributes (brand, size, colour) must be filled; missing values are flagged.

If any check fails, mark the record with an Error status and note the missing or mismatched attribute for manual review.
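
A minimal sketch of these checks in code is shown below, covering brand presence, size format, colour validity, and completion. The function names, the simplified size pattern, the abbreviated colour list, and the "OK" status label are assumptions. The consistency check between description and spec sheet is omitted because it needs both sources.

```python
import re

# Abbreviated stand-in for the Colour List in Appendix C (assumption).
COLOUR_LIST = {"Black", "White", "Red", "Blue", "Gray"}

# Simplified size pattern: optional region prefix, a number, optional unit, or a letter size.
SIZE_RE = re.compile(
    r"(?:(?:US|EU|UK) )?\d+(?:\.\d+)?(?:[- ]?(?:inch|cm|kg))?|XS|S|M|L|XL|XXL"
)

# Placeholder values that count as "missing" for the completion check.
MISSING = {"", "Not Found", "Not Specified"}


def split_values(value: str) -> list[str]:
    # Multiple values are stored as a comma-separated string.
    return [part.strip() for part in value.split(",") if part.strip()]


def validate(attrs: dict[str, str]) -> list[str]:
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    # Brand presence and completion: all three attributes must be filled.
    for name in ("Brand", "Size", "Colour"):
        if attrs.get(name, "").strip() in MISSING:
            problems.append(f"{name} missing")
    # Size format: every listed size must match the recognised pattern.
    if "Size missing" not in problems:
        for size in split_values(attrs["Size"]):
            if not SIZE_RE.fullmatch(size):
                problems.append(f"Size format not recognised: {size}")
    # Colour validity: every listed colour must appear in the colour list.
    if "Colour missing" not in problems:
        for colour in split_values(attrs["Colour"]):
            if colour not in COLOUR_LIST:
                problems.append(f"Colour not in list: {colour}")
    return problems


def status(problems: list[str]) -> str:
    # "Error" comes from the text above; "OK" is an assumed label for a clean record.
    return "Error" if problems else "OK"


good = validate({"Brand": "Adidas", "Size": "8, 9, 10, 11", "Colour": "Black, White"})
bad = validate({"Brand": "Not Found", "Size": "huge", "Colour": "Gainsboro"})
print(status(good), good)
print(status(bad), bad)
```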

8. Special Rules / Edge Cases

  • Multiple Colours – List all colours in the order they appear, separated by commas only.
  • Multiple Sizes – List all sizes; separate with commas (e.g., “9, 9.5”).
  • Unknown Brand – If a brand cannot be identified from the description, product name, or a quick web search, record “Not Found” and flag for manual verification.
  • Unrecognised Colour – If a colour word is not in the colour list, add “(Unconfirmed)” after the colour (e.g., “Gainsboro (Unconfirmed)”).
  • Conflicting Information – When the description and spec sheet disagree on size or colour, prefer the spec sheet; note the conflict in a comment field in the output table.
  • Missing Spec Sheet – If no PDF is supplied, rely solely on the product description.
  • No Size or Colour Mentioned – Record “Not Specified” for the missing attribute.
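
Two of these rules, preferring the spec sheet on conflict and marking unrecognised colours, can be sketched as small helpers. The function names and the comment wording are hypothetical, and the colour list is an abbreviated stand-in for Appendix C.

```python
from typing import Optional

# Abbreviated stand-in for the Colour List in Appendix C (assumption).
COLOUR_LIST = {"Black", "White", "Red", "Blue"}


def resolve(
    attribute: str, from_description: Optional[str], from_spec: Optional[str]
) -> tuple[str, Optional[str]]:
    """Return (value, comment) for one attribute, following the rules above."""
    if from_description is None and from_spec is None:
        return "Not Specified", None  # No size or colour mentioned anywhere.
    if from_spec is None:
        return from_description, None  # Missing spec sheet: rely on the description.
    if from_description is None or from_description == from_spec:
        return from_spec, None
    # Conflict: prefer the spec sheet and record a note for the comment field.
    comment = (
        f"{attribute} conflict: description says '{from_description}', "
        f"spec sheet says '{from_spec}'; spec sheet used."
    )
    return from_spec, comment


def mark_unconfirmed(colour: str) -> str:
    # Colours outside the list are kept as-is with an "(Unconfirmed)" suffix.
    return colour if colour in COLOUR_LIST else f"{colour} (Unconfirmed)"


print(resolve("Colour", "Navy", "Black"))
print(resolve("Size", None, None))
print(mark_unconfirmed("Gainsboro"))
```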

9. Example

Input

  • Product Name: “Adidas UltraBoost Running Shoes”
  • Product Description: “Experience the new Adidas UltraBoost. These shoes feature a breathable knit upper in black and white. Available in sizes 8, 9, 10, and 11. Made with Primeknit technology.”
  • Specification Sheet: PDF file (contains a table showing “Colour: Black/White”, “Size: 10”, “Material: Primeknit”).

Output

Extracted Attributes

AttributeValue
BrandAdidas
Size8, 9, 10, 11
ColourBlack, White

Extraction Summary

“Adidas UltraBoost running shoes, available in sizes 8, 9, 10, 11, in black and white.”

Appendix A – FAQ

Q1: What if the product description does not contain a colour? A: Record “Not Specified” for the colour field, then flag for manual review.

Q2: The product name includes multiple brand names (e.g., “Nike x Adidas”). A: List both brands separated by a slash (e.g., “Nike/Adidas”).

Q3: I find a colour word that isn’t in the colour list. A: Include the word as‑is, add “(Unconfirmed)”, and note the situation in the “Comments” column of the output table.

Q4: How do I handle a size given as “EU 42”? A: Record the size exactly as shown (“EU 42”) and ensure it is placed under the Size column.

Q5: The spec sheet lists a colour as “#000000”. A: Translate the hex code to its colour name (e.g., “Black”) using the Colour List. If unknown, write “#000000 (Unconfirmed)”.

Q6: When should I flag a record for manual review? A: When any of the three attributes are missing or invalid after the validation checks.

Q7: The product has a “size” that is a dimension (e.g., “12‑inch”). A: Record the dimension exactly (e.g., “12‑inch”) under the Size column.
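
The hex-code rule from Q5 can be illustrated with a small lookup. The Colour List in Appendix C contains names only, so the `HEX_TO_COLOUR` mapping and the function name below are assumptions added for illustration.

```python
# Hypothetical hex-to-name mapping; Appendix C does not define hex codes.
HEX_TO_COLOUR = {
    "#000000": "Black",
    "#FFFFFF": "White",
    "#FF0000": "Red",
    "#0000FF": "Blue",
}


def colour_from_hex(code: str) -> str:
    # Normalise case so "#ffffff" and "#FFFFFF" are treated the same.
    cleaned = code.strip()
    name = HEX_TO_COLOUR.get(cleaned.upper())
    # Unknown codes are kept verbatim and marked for review.
    return name if name else f"{cleaned} (Unconfirmed)"


print(colour_from_hex("#000000"))
print(colour_from_hex("#123456"))
```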

Appendix B – Glossary

  • Brand – The manufacturer or label associated with the product (e.g., “Nike”).
  • Size – Numerical or alphanumeric representation of the product’s dimension, measurement, or standard size code.
  • Colour (or Color) – The visual colour of the product (e.g., “Black”, “Red”).
  • Product Description – Textual copy describing the product’s features and specifications.
  • Specification Sheet – A PDF document that contains detailed technical information about the product.
  • Extraction Summary – A short sentence summarising the extracted attributes.

Appendix C – Reference Materials

C.1 – Brand List (E‑Commerce)

Brand
Adidas
Nike
Puma
Under Armour
Reebok
New Balance
Skechers
Asics
Vans
Converse
Timberland
Columbia
North Face
Patagonia
Levi’s
Calvin Klein
Gucci
Prada
Louis Vuitton
Chanel
Dior
... (add as needed)

C.2 – Colour List

Colour
Black
White
Red
Blue
Green
Yellow
Orange
Purple
Pink
Brown
Gray
Navy
Beige
Maroon
Teal
Cyan
Magenta
Gold
Silver
Bronze
Navy Blue
Light Gray
Dark Gray
(Add any industry‑specific colours here, e.g., “Saffron”, “Olive”, “Burgundy”)

C.3 – Size Formats

Format | Example
Numeric | “10”, “9.5”, “12”
US Size | “US 10”, “US 9.5”
EU Size | “EU 42”, “EU 44”
UK Size | “UK 9”, “UK 9.5”
Letter Size | “S”, “M”, “L”, “XL”, “XXL”
Measurement | “12‑inch”, “30 cm”, “1.5 kg”
Combined | “US 9.5 / UK 8”
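
One way to check a value against these formats is a set of regular expressions tried in order, most specific first. The patterns below are an illustrative sketch: the function name is hypothetical, the unit list is limited to the three units in the examples, and "XS" is included because the validation checks mention an "XS-XL" range.

```python
import re
from typing import Optional

NUM = r"\d+(?:\.\d+)?"
REGION = rf"(?:US|EU|UK) {NUM}"

# Ordered from most to least specific so "US 9.5 / UK 8" is not read as "US Size".
SIZE_FORMATS = [
    ("Combined", rf"{REGION} / {REGION}"),
    ("US Size", rf"US {NUM}"),
    ("EU Size", rf"EU {NUM}"),
    ("UK Size", rf"UK {NUM}"),
    ("Letter Size", r"XS|S|M|L|XL|XXL"),
    ("Measurement", rf"{NUM}[- ](?:inch|cm|kg)"),
    ("Numeric", NUM),
]


def classify_size(value: str) -> Optional[str]:
    """Return the matching format name, or None if no format matches."""
    # Treat the non-breaking hyphen (U+2011) like an ordinary hyphen.
    normalised = value.strip().replace("\u2011", "-")
    for name, pattern in SIZE_FORMATS:
        if re.fullmatch(pattern, normalised):
            return name
    return None


for sample in ["9.5", "EU 42", "XL", "12-inch", "US 9.5 / UK 8", "huge"]:
    print(sample, "->", classify_size(sample))
```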

C.4 – Style Guide for Outputs

  • Capitalisation: Capitalise the first letter of each word in the Brand and Colour fields; size values keep their original case (e.g., “XS”, “9.5”).
  • Punctuation: The Extraction Summary ends with a single period; no extra punctuation inside.
  • Lists: Separate multiple values with a comma followed by a space (e.g., “Black, White”).
  • Spacing: No space before a comma and exactly one space after it.
  • Ordering: For multiple colours or sizes, preserve the order they appear in the source material; do not reorder alphabetically.
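
The formatting rules above could be applied with helpers along these lines. This is a sketch under the assumption that values arrive as a list of strings; the function names are hypothetical.

```python
def capitalise_words(value: str) -> str:
    # Upper-case the first letter of each word and leave the rest untouched,
    # so a mixed-case name such as "UltraBoost" is not flattened.
    return " ".join(word[:1].upper() + word[1:] for word in value.split())


def join_values(values: list[str], capitalise: bool) -> str:
    # Source order is preserved; values are separated by a comma and one space.
    cleaned = [v.strip() for v in values if v.strip()]
    if capitalise:
        cleaned = [capitalise_words(v) for v in cleaned]
    return ", ".join(cleaned)


def finish_summary(sentence: str) -> str:
    # The summary ends with exactly one period.
    return sentence.rstrip(" .") + "."


print(join_values(["black", " white"], capitalise=True))   # Brand and Colour fields
print(join_values(["XS", "9.5"], capitalise=False))        # Size keeps its original case
print(finish_summary("Adidas shoes, size 10, in black.."))
```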

C.5 – Validation Checklist (for quick reference)

  • Brand – Present? (Yes/No) – If No → “Not Found”.
  • Size – Valid format? (Yes/No) – If No → “Not Specified”.
  • Colour – In colour list? (Yes/No) – If No → append “(Unconfirmed)” to the colour.
  • All Three Present? – Yes → Proceed; No → Flag for manual review.

C.6 – Example of Multiple Colours

Source Text | How to Record
“Black/White” | Two colours listed with a slash – record as “Black, White”.
“Black & White” | Two colours separated by “&” – record as “Black, White”.
“Black/White/Red” | Three colours – record as “Black, White, Red”.

Tip: When in doubt, use the first colour mentioned as the primary colour and list any additional colours after it.
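
The splitting rules in the table can be sketched with a single regular expression. The function name is hypothetical, and only the two separators named above ("/" and "&") are handled.

```python
import re


def split_colours(raw: str) -> list[str]:
    # Split on "/" or "&", ignoring surrounding spaces, and keep source order.
    parts = re.split(r"\s*[/&]\s*", raw)
    return [part.strip().title() for part in parts if part.strip()]


colours = split_colours("Black/White/Red")
print(", ".join(colours))        # value recorded in the Colour field
print("Primary:", colours[0])    # the first colour mentioned is treated as primary
```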


Additional Notes

  • Always keep the original source wording in mind when extracting values; do not re‑interpret or re‑write unless the attribute format requires it (e.g., converting “12‑inches” to “12‑inch”).
  • If the product name contains a brand and the description also mentions the brand, confirm they match. If they differ, note the discrepancy in a comment column (optional).
  • Maintain a log of all records that are flagged for manual review to ensure follow‑up.

We build it

Extract Attributes

Extracts brand, size, and colour attributes from a product's name, description, and optional specification sheet PDF.

The Pain of Manual Attribute Entry

  • Inconsistent wording leads to broken filters and frustrated shoppers.
  • Typing each attribute by hand eats into time that could be spent on strategy.
  • Human error introduces mismatches between product pages and inventory systems.

Turning Text Into Structured Data

Logic’s Attribute Extraction workflow lets a large language model read product names, descriptions, and optional spec‑sheet PDFs, then output a concise table of brand, size, and colour. The process respects industry‑standard lists, handles multiple values, and flags any uncertainty for quick review.

What You Gain

  • Uniform attribute data that powers reliable search and recommendation engines.
  • Faster catalog updates that let analysts focus on higher‑value analysis.
  • Built‑in validation that catches missing or conflicting information before it reaches the storefront.

Key Insight

Consistent attribute extraction is the quiet engine behind seamless shopper experiences – it reduces friction at every touchpoint from search to checkout.

A Snapshot of Benefits

Benefit | Why It Matters
Consistency | Guarantees that filters and facets work as shoppers expect.
Speed | Cuts the time to publish a new product from hours to minutes.
Accuracy | Minimises costly re‑work caused by data mismatches.

Why This Workflow Stands Out

  • Robust validation – Each extraction undergoes checks for brand presence, size format, and colour validity, with clear flags for manual review.
  • Edge‑case handling – Multiple colours, size ranges, unknown brands, and hex colour codes are all interpreted according to a defined rule set, so no detail is lost.
  • Seamless integration – The workflow fits into existing Logic pipelines, allowing teams to trigger extraction whenever a product is added or updated without additional coding.

By embedding this workflow into your catalog process, your team can shift from repetitive data entry to strategic insight, keeping the product catalog fresh, searchable, and ready for shoppers.

Ready to Automate?

Get started with this workflow template in minutes. No complex setup required.
