Converting any website to a React component
10 minutes · Teddy Ni


I recently built a Chrome Extension that converts any snippet of a website into an isolated React component.


It was one of the most difficult things I've built, but I also think it's pretty cool. So, I thought I'd share how it works.


The Problem

Converting HTML to valid React was the first step. That was straightforward once I discovered window.getComputedStyle because I could extract all the CSS properties of an element.

This was the very first version of the algorithm:

  1. Iterate through each node in the DOM tree.
  2. For each node, get the value of every single CSS property via window.getComputedStyle.
  3. Do some parsing to convert the HTML into JSX.
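
A rough sketch of step 2 for a single element. The names `ComputedStyleLike`, `toReactStyleKey`, and `dumpAllStyles` are made up for illustration and are not the extension's actual code; the interface only lists the parts of `CSSStyleDeclaration` the sketch touches.

```typescript
// The subset of CSSStyleDeclaration this sketch relies on.
// The object returned by window.getComputedStyle(el) satisfies it.
interface ComputedStyleLike {
  readonly length: number;
  item(index: number): string;
  getPropertyValue(property: string): string;
}

// Convert a CSS property name into the key React expects in a style object.
function toReactStyleKey(prop: string): string {
  if (prop.startsWith("--")) return prop; // custom properties keep their name
  // React uses `msTransform` but `WebkitTransform`, so only -ms- loses its leading dash.
  const normalized = prop.startsWith("-ms-") ? prop.slice(1) : prop;
  return normalized.replace(/-([a-z])/g, (_match, letter: string) => letter.toUpperCase());
}

// Naive version: copy every computed property, whether or not the page ever set it.
function dumpAllStyles(computed: ComputedStyleLike): Record<string, string> {
  const style: Record<string, string> = {};
  for (let i = 0; i < computed.length; i++) {
    const prop = computed.item(i);
    style[toReactStyleKey(prop)] = computed.getPropertyValue(prop);
  }
  return style;
}
```

In a page this would be called as `dumpAllStyles(window.getComputedStyle(element))`, and the result contains an entry for every property the browser reports.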

While this works, the generated code is unusable because all elements have literally every CSS property explicitly set. No human would ever write code like this:

Example of code with all properties set

Note: If you know someone who writes React like this, I have something to tell you.


I quickly realized that getting useful React code while preserving the original styling is the primary challenge here.


First Attempt using AI

It's 2024, so naturally my first attempt was to feed the extracted HTML into an LLM. I prompted an LLM to clean up the HTML, remove redundant styles, and "prepare the component for production use."

While this worked ok for simple components, there were major problems:

  • Limited context length. Passing in the extracted styles as context for more complex components was simply too much and unreliable.
  • Variability. Sometimes you would get a good result, but other times you would get a bad result. An LLM was not good at consistently removing redundant styles.
  • Cost. Each conversion costs money (albeit not much). But this made me want to explore other deterministic options.

Second Attempt using getComputedStyle

My goal became reducing the number of CSS properties needed to maintain the same visual look.

The breakthrough came when I realized that we only needed the CSS properties that were explicitly set on the element, either via inline styles or via a stylesheet.

Then, only for the explicitly set CSS properties, I could extract and include the value.

My updated algorithm looked like this:

  1. Iterate through each node in the selected part of the DOM tree.
  2. For each element, find any matching CSS selectors or inline styles. Generate a list of the CSS properties that are explicitly set.
  3. Use getComputedStyle on the list of properties to get the values we need to set.
  4. Construct the JSX for the component using the above information.
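
Steps 2 and 3 could be sketched roughly as follows. This is a simplified illustration under a few assumptions: `rules` is a flat list of style rules already gathered from the page, the interfaces are stand-ins for the real `CSSStyleRule`, `Element`, and `CSSStyleDeclaration` objects, and the function names are hypothetical. It also ignores pseudo-elements and whether a media query currently applies.

```typescript
// Stand-ins for the DOM/CSSOM objects; the real ones satisfy these shapes.
interface DeclarationLike {
  readonly length: number;
  item(index: number): string;
}
interface StyleRuleLike {
  selectorText: string;
  style: DeclarationLike;
}
interface ElementLike {
  matches(selector: string): boolean;
  style: DeclarationLike; // inline styles
}

// Step 2: collect the names of properties set by matching rules or inline styles.
function explicitProperties(element: ElementLike, rules: StyleRuleLike[]): Set<string> {
  const props = new Set<string>();
  const addAll = (decl: DeclarationLike): void => {
    for (let i = 0; i < decl.length; i++) props.add(decl.item(i));
  };
  for (const rule of rules) {
    let isMatch = false;
    try {
      isMatch = element.matches(rule.selectorText);
    } catch {
      // Selectors the browser cannot parse throw; treat them as "no match".
    }
    if (isMatch) addAll(rule.style);
  }
  addAll(element.style);
  return props;
}

// Step 3: read the final value of only those properties from the computed style.
function readExplicitValues(
  props: Set<string>,
  computed: { getPropertyValue(property: string): string }
): Record<string, string> {
  const values: Record<string, string> = {};
  props.forEach((prop) => {
    values[prop] = computed.getPropertyValue(prop);
  });
  return values;
}
```

Reading the values from the computed style, instead of from the matching rules themselves, sidesteps having to work out which rule wins the cascade.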

BROWSER KNOWLEDGE

The browser has really powerful stylesheet APIs. You can manually create, edit, and query stylesheets in JavaScript, which I didn't know until working on this.
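
As a small illustration of the querying side, here is a sketch that flattens every style rule out of a list of stylesheets. `collectStyleRules` and the two interfaces are hypothetical names; the interfaces are minimal stand-ins that real `CSSStyleSheet`, `CSSStyleRule`, and `CSSMediaRule` objects satisfy.

```typescript
// Minimal shapes for the CSSOM objects used below.
interface RuleLike {
  selectorText?: string;          // present on style rules
  cssRules?: ArrayLike<RuleLike>; // present on grouping rules such as @media
}
interface SheetLike {
  readonly cssRules: ArrayLike<RuleLike>;
}

// Collect every style rule from the given sheets, descending into grouping rules.
function collectStyleRules(sheets: ArrayLike<SheetLike>): RuleLike[] {
  const found: RuleLike[] = [];
  const walk = (rules: ArrayLike<RuleLike>): void => {
    for (const rule of Array.from(rules)) {
      if (typeof rule.selectorText === "string") found.push(rule);
      if (rule.cssRules) walk(rule.cssRules);
    }
  };
  for (const sheet of Array.from(sheets)) {
    try {
      walk(sheet.cssRules);
    } catch {
      // Reading cssRules on a cross-origin stylesheet throws a SecurityError; skip that sheet.
    }
  }
  return found;
}
```

In a page, `collectStyleRules(document.styleSheets)` would be the entry point. Note that this sketch includes rules inside `@media` blocks regardless of whether the media query currently matches.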


On average, the output went from 200+ CSS properties to around 5-10 properties on each element. I was super excited because this meant:

  • Revisit AI. Because context length was reduced, AI performance improved greatly. Still not perfect, but a lot better. (I now let users decide if they want to feed the output into AI through a "convert" button.)
  • Readable React code. It doesn't look like a human wrote it quite yet, but it's way more readable than before.
  • Optimization opportunities. It was clear what else I could improve.

Further Optimizations

Now that we have the baseline styles, we can trim it down even more.

Optimization 1: Abstracting global styles

Oftentimes, websites will have certain global styles. For example, resetting the box-sizing property is pretty popular:

div {
    box-sizing: border-box;
}

Obviously, I didn't want to include boxSizing: 'border-box' on every single div in the component. So, I wrote this function that looks for shared properties and abstracts them into a top-level style tag.


function extractExplicitStyles(
  element: Element,
  styleSheets: CSSStyleSheet[],
  pseudoElement?: string
): StyleObject {
  // Only HTML and SVG elements are handled; anything else gets an empty style object.
  if (!(element instanceof HTMLElement || element instanceof SVGElement)) {
    return {}
  }
  const styleObject: StyleObject = {}

  // Walk a list of CSS rules and handle the plain style rules.
  const processRules = (rules: CSSRuleList) => {
    Array.from(rules).forEach((rule) => {
      if (rule instanceof CSSStyleRule) {
        ...
      }
    })
  }
  ...
}

On average, this helped reduce the lines of code in the component by ~5%.
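
One way the hoisting step could look, as a minimal sketch over plain data instead of live DOM nodes. `StyledNode` and `hoistSharedStyles` are hypothetical names, and the rule used here (hoist a declaration only when every node with that tag agrees on it) is an assumption, not necessarily the extension's exact heuristic.

```typescript
interface StyledNode {
  tag: string;
  style: Record<string, string>; // CSS property name -> value
}

function hoistSharedStyles(nodes: StyledNode[]): { css: string; nodes: StyledNode[] } {
  // Group nodes by tag name.
  const byTag = new Map<string, StyledNode[]>();
  for (const node of nodes) {
    const group = byTag.get(node.tag) ?? [];
    group.push(node);
    byTag.set(node.tag, group);
  }

  // For each tag used at least twice, keep the declarations every node agrees on.
  const shared = new Map<string, Record<string, string>>();
  byTag.forEach((group, tag) => {
    if (group.length < 2) return;
    const common: Record<string, string> = {};
    for (const prop of Object.keys(group[0].style)) {
      const value = group[0].style[prop];
      if (group.every((n) => n.style[prop] === value)) common[prop] = value;
    }
    if (Object.keys(common).length > 0) shared.set(tag, common);
  });

  // Emit one CSS rule per tag for the top-level <style> element.
  const cssBlocks: string[] = [];
  shared.forEach((decls, tag) => {
    const body = Object.keys(decls).map((p) => `  ${p}: ${decls[p]};`).join("\n");
    cssBlocks.push(`${tag} {\n${body}\n}`);
  });

  // Strip the hoisted declarations from the individual nodes.
  const stripped = nodes.map((node) => {
    const common = shared.get(node.tag);
    if (!common) return node;
    const style: Record<string, string> = {};
    for (const prop of Object.keys(node.style)) {
      if (!(prop in common)) style[prop] = node.style[prop];
    }
    return { ...node, style };
  });

  return { css: cssBlocks.join("\n\n"), nodes: stripped };
}
```

One tradeoff to keep in mind: a bare tag selector in a style tag also applies to matching elements outside the component, so a real implementation would probably want to scope the selector.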

Optimization 2: Removing inherited styles

Another optimization I made was removing styles that were already being inherited from parent elements.

For example, if you look at this HTML:

<div style={{ color: 'blue' }}>
  <span style={{ color: 'blue' }}> Hello World </span>
</div>

The span element doesn't need to explicitly set color: 'blue' since it will inherit that from its parent div. I wrote logic to detect these cases and remove redundant style properties.

This was super effective for properties like:

  • color
  • font-family
  • font-size
  • line-height

On average, this helped reduce the lines of code in the component by ~10%.
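
A minimal sketch of that pruning pass, assuming the styles are already resolved computed values (so equal strings really mean equal rendering). `StyleNode`, `INHERITED_PROPS`, and `pruneInherited` are hypothetical names, and the property list is deliberately short. The sketch also ignores user-agent defaults: a link or heading may need its explicit value even when the parent has the same one.

```typescript
interface StyleNode {
  tag: string;
  style: Record<string, string>; // CSS property name -> computed value
  children: StyleNode[];
}

// A few properties that CSS inherits by default (not an exhaustive list).
const INHERITED_PROPS = new Set(["color", "font-family", "font-size", "line-height"]);

// Return a copy of the tree without declarations an ancestor already supplies.
function pruneInherited(node: StyleNode, inherited: Record<string, string> = {}): StyleNode {
  const style: Record<string, string> = {};
  const nextInherited: Record<string, string> = { ...inherited };
  for (const prop of Object.keys(node.style)) {
    const value = node.style[prop];
    const isInheritable = INHERITED_PROPS.has(prop);
    // Descendants inherit this node's value, whether or not it is kept here.
    if (isInheritable) nextInherited[prop] = value;
    // Keep the declaration unless an ancestor already provides the same value.
    if (!(isInheritable && inherited[prop] === value)) style[prop] = value;
  }
  return {
    ...node,
    style,
    children: node.children.map((child) => pruneInherited(child, nextInherited)),
  };
}
```

Passing the inherited values down the recursion, instead of comparing only against the direct parent, also covers the case where an intermediate element sets nothing and the value comes from a grandparent.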

Optimization 3: Pulling out SVGs

This was an obvious one once you saw the output at this stage. SVGs tend to contribute a lot of noise and bloat the component size. A single SVG is like 200 lines of mumbo jumbo code. And they appeared more often than I originally expected, primarily because of icons.

I now pull the SVGs into their own components and import them to keep the core component smaller. (Later on, this proved to be extra useful because users tend to have their own icon set regardless.)

In most cases, this helped reduce the lines of code in the component by ~20%.
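
A string-based sketch of the idea, assuming the JSX has already been generated as text. `extractSvgs` and the `Icon1`, `Icon2`, ... naming scheme are made up for illustration, and the regex does not handle nested or self-closing svg tags.

```typescript
// Replace each inline <svg>...</svg> with a component reference and collect the markup.
function extractSvgs(jsx: string): { jsx: string; icons: Map<string, string> } {
  const icons = new Map<string, string>();        // component name -> SVG markup
  const nameByMarkup = new Map<string, string>(); // reuse one component for identical SVGs
  const rewritten = jsx.replace(/<svg\b[\s\S]*?<\/svg>/g, (markup) => {
    let name = nameByMarkup.get(markup);
    if (!name) {
      name = `Icon${icons.size + 1}`;
      nameByMarkup.set(markup, name);
      icons.set(name, markup);
    }
    return `<${name} />`;
  });
  return { jsx: rewritten, icons };
}
```

Each entry in `icons` can then be written out as its own component file and imported by the main component. Deduplicating by markup means an icon that appears ten times only produces one component.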

Optimization 4: Condensing styles to their shorthand properties

In CSS, there's a shorthand property for several styles. For example, padding: 10px 50px 20px; is the same as:

  • padding-top: 10px
  • padding-right: 50px
  • padding-left: 50px
  • padding-bottom: 20px

So, I wrote a function that condenses the styles for border, padding, and margin. It's gnarly because the logic differs depending on how many values are specified. In the case of padding, when three values are specified, the first value applies to the top, the second to the right and left, and the third to the bottom. But when one value is specified, it applies the same padding to all four sides:


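// All four sides share one value: write it to set[4] (which appears to be the
// shorthand key) and drop the four per-side entries in set[0]..set[3].
if ([second, third, fourth].every((v) => v === first) && first) {
  cleanedStyles[set[4]] = first
  delete cleanedStyles[set[0]]
  delete cleanedStyles[set[1]]
  delete cleanedStyles[set[2]]
  delete cleanedStyles[set[3]]
}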
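
For the general case, the one- to four-value rules for padding and margin can be sketched like this. `condenseBox` is a hypothetical helper, not the function from the snippet above, and it leaves out border, whose shorthand combines width, style, and color and needs different logic.

```typescript
type Styles = Record<string, string>;

// Collapse the four per-side longhands of padding or margin into the shortest shorthand.
function condenseBox(styles: Styles, shorthand: "padding" | "margin"): Styles {
  const sides = ["top", "right", "bottom", "left"].map((side) => `${shorthand}-${side}`);
  const [top, right, bottom, left] = sides.map((key) => styles[key]);
  // Only condense when all four longhands are present.
  if ([top, right, bottom, left].some((v) => v === undefined)) return styles;

  let value: string;
  if (top === right && top === bottom && top === left) {
    value = top;                                 // one value: all four sides
  } else if (top === bottom && right === left) {
    value = `${top} ${right}`;                   // two values: vertical, horizontal
  } else if (right === left) {
    value = `${top} ${right} ${bottom}`;         // three values: top, horizontal, bottom
  } else {
    value = `${top} ${right} ${bottom} ${left}`; // four values: clockwise from the top
  }

  const result: Styles = { ...styles };
  for (const key of sides) delete result[key];
  result[shorthand] = value;
  return result;
}
```

With the four longhands from the example above (10px top, 50px right and left, 20px bottom), this produces `padding: 10px 50px 20px`.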

What's next?

I'll be honest: there are still a few bugs. When the entire internet is your input set, there are a lot of edge cases. One particular bug that I haven't cracked yet is dealing with images that are referenced by their relative path.
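
One possible starting point for the relative-path case (a sketch, not a fix the extension is known to use) is resolving each `src` against the URL of the page it came from with the standard `URL` constructor. `absolutizeSrc` is a hypothetical helper, and the sketch does not cover `srcset`, CSS `url(...)` values, or pages that set a `<base>` element.

```typescript
// Resolve an image src against the page it was captured from.
function absolutizeSrc(src: string, pageUrl: string): string {
  try {
    // Absolute URLs pass through unchanged; relative ones are resolved against pageUrl.
    return new URL(src, pageUrl).href;
  } catch {
    return src; // leave anything unparseable untouched
  }
}
```

In the extension, `pageUrl` would presumably be something like `document.baseURI` at capture time.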

Here are some real examples if you're curious what the final output looks like:

I've been thinking about open-sourcing the core library, but the code is pretty messy. If you're interested in this, let me know.

Here's the extension if you want to give it a try.

Thanks for reading! This was fun to hack on - Teddy