Skin Color Protocols

Ensuring diversity of skin pigments in a medical device study has become recognized as increasingly important yet challenging.

What is skin color?

There are many terms related to skin color that are often used interchangeably, but in fact may have different meanings. Here are a few key definitions from the terminology section of this toolkit:

Skin color

Human skin color is the result of how light interacts with the skin surface and deeper tissues. It is affected by multiple chromophores, molecules that absorb light at specific wavelengths and thereby determine the color that is reflected. The main chromophores affecting human skin color include oxyhemoglobin (red), deoxyhemoglobin (dark red/brown), carotenoids (a yellow-orange exogenous pigment), bilirubin (yellow), biliverdin (green) and melanin (brown). The epidermis contains melanin but little hemoglobin, while the dermis contains little melanin but significant vascularity (and therefore hemoglobin). Color and pigmentation can be affected by numerous environmental, pharmacologic, and medical factors. (See also ‘race’ and ‘ethnicity’)

Skin pigment

Human skin pigments include two forms of melanin (eumelanin and pheomelanin), which are produced and contained in the epidermis, the outer layer of the skin. Pheomelanin is associated with light red or yellow colors, while eumelanin is associated with dark brown or black colors. The amount of melanin in the skin affects color (i.e. skin with more melanin appears darker). Color and pigmentation can be affected by numerous environmental, pharmacologic, and medical factors. Melanin and hemoglobin strongly absorb visible and ultraviolet light, and show less absorption at near-infrared wavelengths.

When to measure skin color?

Skin color is a factor associated with many things, ranging from how optically based technologies work (e.g. pulse oximeters, self-driving cars, wearable health technologies) to a person’s access to education, earning, employment, and health. In some cases this association is causal and in other cases it is not. While much attention is given to race, ethnicity and racism, relatively little attention is given to color and colorism. Skin color should not be conflated with race or ethnicity (important note: race is not a biological or genetic characteristic). Although some studies may collect data on race or ethnicity as a proxy for skin color, it is critical to first question and understand the purpose of the data being collected. In some cases, such as studies related to optically based technologies, it is more precise to measure skin color directly than to collect race or ethnicity as a surrogate for light or dark skin color. As a researcher, you should articulate the purpose of each study variable in both the study design and the study publications. Be sure to state whether you are using the variable as a cause or as a marker of some social exposure. If the latter, you should consider addressing the exposure more directly. For example:

 

  • Good: Does skin pigment interact with red or infrared light of pulse oximeters resulting in performance bias?
  • Good: Are certain conditions that have been associated with race, such as lung disease, vascular disease, environmental exposure, or smoking also linked to performance bias in pulse oximeters?
  • Not Good: Do biological differences between black and white people result in worse performance of pulse oximeters?

If you are unsure which variables are best suited to help answer your research questions or how to categorize variables, engage with broad stakeholders. It is also important to have a diverse study team. Consider inviting researchers and members from the communities you are studying to be involved in the project starting from the planning phase. Different perspectives and community insights can help refine and enhance the objectives and outcomes.

When comparing across studies or datasets, especially older data, be wary of comparing racial categories, which may have changed over time or may be inherently different across settings. If the legacy data are no longer relevant to the study questions you are investigating, consider collecting new data that align with the current context. This allows us, as researchers, to be responsive and adaptable to new information.

If using race in addition to skin color, consider asking how your study cohort has been racialized (i.e., assigned to a racial group, with that assignment used as the basis for differential treatment). Race is a social construct and is not applied in a standardized manner across settings.

  • What are examples when you should consider measuring/assessing skin color?
    • Examples when this was not done and had negative consequences
  • What are examples when you should NOT use skin color/pigment?

 

How can skin color be characterized?

There are several commonly used methods for characterizing skin color and skin pigment; however, there is no consensus on a standard or validated approach. Not all techniques have been used to validate diversity of skin color in cohorts for medical device research and development. Below we discuss some of these methods, including evolving evidence and regulatory guidance.

Before selecting a method, it is important to clarify why you want to measure this variable and, specifically, whether the study is interested in color, pigment, or other variables (e.g. colors of deeper structures). It is also worth noting that skin color usually reflects only the outermost layers of skin, and some devices, like pulse oximeters, may be affected by the color (light absorbance/reflection) of deeper tissues.

Protocols

Quantification of skin color for research or device development is a topic with limited consensus and evolving data. There are many methods, each with many caveats. The OpenOximetry.org Project is tracking and piloting several methods in laboratory and clinical trials. Our latest protocols are online and open access.

Download Protocols

Subjective methods

Numerous methods have been used for subjectively characterizing skin color, though not all of these scales were designed or validated for this purpose. Many of these methods are listed in Table 1 below. Important characteristics to consider when selecting a method include:

  1. Standardization of the scale
  2. Usability
  3. Repeatability/reproducibility (printing)
  4. Potential bias – raters may be biased by their personal background or experience with assessing color. Some observers consistently rank subjects as lighter or darker on skin color scales. This can be mitigated by having 3 or more observers assess each subject using the subjective method (Verkruysse et al 2024).
  5. Validation
  6. Intention
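
As a rough illustration of the multi-observer mitigation described in item 4, the sketch below combines ratings from three or more observers by taking the per-subject median, and reports each observer's average offset from that consensus. The choice of the median, the offset statistic, the function name and the data layout are all assumptions made for illustration; they are not taken from Verkruysse et al 2024 or from the OpenOximetry protocols.

```python
from statistics import mean, median


def consensus_ratings(ratings, min_observers=3):
    """Combine subjective scale ratings from several observers.

    ratings: {observer_id: {subject_id: scale_value}} (hypothetical layout).
    Returns (consensus, offsets): the per-subject median rating, and each
    observer's mean signed difference from that median.
    """
    if len(ratings) < min_observers:
        raise ValueError(f"need at least {min_observers} observers")

    # Only use subjects that every observer rated.
    shared = set.intersection(*(set(r) for r in ratings.values()))
    if not shared:
        raise ValueError("no subject was rated by every observer")

    consensus = {
        s: median(r[s] for r in ratings.values()) for s in sorted(shared)
    }
    # An offset far from zero suggests an observer who systematically
    # picks higher or lower scale values than their peers.
    offsets = {
        obs: mean(r[s] - consensus[s] for s in shared)
        for obs, r in ratings.items()
    }
    return consensus, offsets


if __name__ == "__main__":
    # Made-up ratings on a numeric scale, for illustration only.
    example = {
        "observer_a": {"s1": 3, "s2": 6, "s3": 9},
        "observer_b": {"s1": 4, "s2": 7, "s3": 10},
        "observer_c": {"s1": 3, "s2": 6, "s3": 9},
    }
    consensus, offsets = consensus_ratings(example)
    print(consensus)
    print(offsets)
```

Whether a positive offset means "rates darker" or "rates lighter" depends on the direction of the scale being used, so the sign should be interpreted per scale.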

Objective methods

Numerous methods have been used for objectively characterizing skin color, though limited consensus exists on how to optimally utilize these methods. Also of note, there is considerable heterogeneity in how these devices/methods work at a technical level as well as what data they report. Many of these methods are listed in Table 2 below. Important characteristics to consider when selecting an objective method include:

  1. Standardization and transparency of reported data: should be observer independent, quantitative. Several studies are ongoing to compare outputs from different devices.
  2. Repeatability – may be affected by study subject variables such as age, sex, anatomic site and skin surface properties; temporal factors such as sun exposure and orthostatic effect; external variables such as lighting and temperature; and how the technology is applied, such as the pressure used.
  3. Calibration
  4. Usability – small areas vs larger, pressure
  5. Validation
  6. Intention – interaction with surface pigment or color, vs interaction with deeper structures (e.g. blood, tissue, bone, fat)
  7. Interpreting data generated by objective measures – some but not all metrics used by objective methods are standardized. Below is a summary of the most commonly reported data:
    1. RGB color space – this number is used to characterize color by the intensity of values of red, green, and blue color
    2. CIELAB color space – defined by the Commission Internationale de l’Eclairage (CIE), this is a three-dimensional color space whose three axes are defined below. L* and b* have been correlated with pigment, and a* with erythema levels. CIELAB is intended to be more perceptually uniform than RGB space: the distance between two colors in LAB space roughly quantifies how perceptibly different they are. Distance in RGB space is thought to be less linear in this sense, so, for example, doubling the distance may not make two colors twice as easy to tell apart.
      1. L* = lightness 0 (black) to 100 (white)
      2. a* = red/green on the chroma plane
      3. b* = yellow/blue on the chroma plane
    3. Individual Typology Angle (ITA) – this metric is used to classify skin pigmentation, specifically melanin, using L* and b* according to the formula ITA = arctan((L* − 50) / b*) × 180/π. Of note, many publications have circulated a slightly altered formula for calculating ITA. The altered formula results in minor miscalculations of ITA for medium-pigmented skin, but it causes large discrepancies in ITA values for very dark and very light-pigmented skin (Verkruysse and Jaffe 2024). Ensure that you are using the correct formula before performing ITA calculations. Del Bino et al. showed that ITA has a linear relationship with tissue melanin content (originally using ex-vivo samples). The ITA was introduced in 1991 (Chardon et al 1991) with 6 proposed categories of skin color (very light > 55° > light > 41° > intermediate > 28° > tan > 10° > brown). However, the original study bins were based on data from only Caucasian subjects and only non-sun-exposed (intrinsically lighter) skin. A subsequent study by Del Bino et al. in 2013 included a more diverse population (as well as measurements from sun-exposed, i.e. intrinsically darker, sites) and introduced an additional ITA cut-off of < −30° for darker skin. It is important to note that this expanded and most current categorization of skin color by ITA may still not reflect an optimized categorization of global skin colors (Leeb et al, eBiomed, 2023). Of note, ITA is not skin color: a single ITA value may be associated with many different perceived skin colors, and ITA does not account at all for the red/green color scale. ITA has been the most commonly evaluated and validated metric for melanin.
    4. Melanin index (MI) – this metric is intended to quantify melanin as a single number, sometimes by taking log10 of the reflectance ratio. Not all device manufacturers provide detail on the derivation of MI and thus MI may not be comparable across devices, subjects and studies.
    5. Melanin density – this metric has been reported by subtracting reflectance at 400nm from 420nm (R420-R400), in order to reduce confounding by hemoglobin which has similar absorption at both wavelengths
    6. Erythema Index (EI) – non standardized metric for quantifying amount of erythema present in skin
    7. Illuminant & observer angle – illuminant and observer angle help convert a spectrum to a set of LAB and RGB values
    8. Other proposed strategies
      –  R390 – this is the isosbestic point for oxyhemoglobin and deoxyhemoglobin; light at this wavelength penetrates only to shallow depths and thus has been used to characterize epidermal melanin
      –  AU intensity curve 450-615
      –  Slope of absorbance spectrum (620-720nm)
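
The ITA calculation and cut-offs described in item 3 of the list above can be sketched in a few lines of Python. This is a minimal illustration, not a validated implementation: the function names are hypothetical, the label "dark" for values at or below −30° is an assumed name (the text only gives the cut-off), the handling of values that fall exactly on a cut-off is an assumption, and the sketch rejects b* ≤ 0 because the formula divides by b*.

```python
import math

# Cut-offs in degrees, as quoted in the text. Each label applies to ITA
# values strictly above its cut-off (boundary handling is an assumption).
ITA_BINS = [
    (55.0, "very light"),
    (41.0, "light"),
    (28.0, "intermediate"),
    (10.0, "tan"),
    (-30.0, "brown"),
]
BELOW_ALL_BINS = "dark"  # assumed label for ITA <= -30 degrees


def ita_degrees(l_star, b_star):
    """ITA = arctan((L* - 50) / b*) * 180 / pi, returned in degrees."""
    if b_star <= 0:
        # Assumption of this sketch: only positive b* is handled.
        raise ValueError("this sketch expects b* > 0")
    # Note the parentheses: the whole ratio goes inside the arctangent.
    return math.degrees(math.atan((l_star - 50.0) / b_star))


def ita_category(ita):
    """Map an ITA value in degrees to one of the categories above."""
    for cutoff, label in ITA_BINS:
        if ita > cutoff:
            return label
    return BELOW_ALL_BINS


if __name__ == "__main__":
    # Made-up CIELAB readings, for illustration only.
    for l_star, b_star in [(70.0, 20.0), (50.0, 15.0), (30.0, 20.0)]:
        ita = ita_degrees(l_star, b_star)
        print(f"L*={l_star}, b*={b_star}: ITA={ita:.1f} deg, {ita_category(ita)}")
```

Because a single ITA value can correspond to many perceived colors and ignores a*, the category returned here describes pigmentation only and should be stored alongside the raw L*, a* and b* values rather than in place of them.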

Assessment Strategy & Best Practices

Both subjective and objective assessment methods can be applied in many different ways that impact the reliability of data.

  • It matters if assessment methods are being applied to printed photos, digital photos or real-life people.
  • Factors like lighting, printer color calibration, and monitor color calibration become significant factors.
    • Inconsistencies between printers can cause variations in the color values of a printed scale. This can be at least partially avoided by using specific printing instructions for each print of a scale (Verkruysse et al 2024). Specific printing guidelines for the Monk Skin Tone Scale are coming soon.
  • Some assessment methods were not designed or validated to be used in multiple ways, so be sure to check how your method was intended to be used.
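
One way to check a printed scale against its intended colors is to measure each printed swatch with a colorimeter or spectrophotometer and compute the CIELAB color difference from the reference value. The sketch below uses the CIE76 difference (ΔE*ab, the Euclidean distance in L*a*b* space). The function names, the swatch names, the Lab values and the tolerance are all hypothetical; an acceptable tolerance would need to be chosen for the specific scale and study, and this is not the procedure from Verkruysse et al 2024.

```python
import math


def delta_e_ab(lab_1, lab_2):
    """CIE76 color difference between two (L*, a*, b*) triples."""
    return math.dist(lab_1, lab_2)


def swatches_out_of_tolerance(reference, measured, tolerance):
    """Return {swatch_name: delta_e} for swatches drifting beyond tolerance.

    reference, measured: {swatch_name: (L*, a*, b*)} with matching keys.
    """
    drift = {
        name: delta_e_ab(reference[name], measured[name]) for name in reference
    }
    return {name: d for name, d in drift.items() if d > tolerance}


if __name__ == "__main__":
    # Made-up values, for illustration only.
    reference = {"tone_1": (90.0, 2.0, 10.0), "tone_2": (80.0, 5.0, 15.0)}
    printed = {"tone_1": (87.0, 6.0, 10.0), "tone_2": (80.0, 5.0, 15.5)}
    print(swatches_out_of_tolerance(reference, printed, tolerance=2.0))
```

The same `delta_e_ab` function can be applied to neighboring reference swatches to see how far apart adjacent tones on a scale actually are.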

 

Additional References

Table 2. Objective skin color measurement methods

  • Mechanism: Spectrophotometer (classification: objective)
    • Illumination/aperture: Light source: pulsed xenon lamp with UV cut filter; illuminated area: 6mm vs 11mm depending on aperture; angle: 8° viewing angle, 2° or 10° observer angle; wavelength range: 400-700nm
    • Measurements: Eligible ISO 17025 certification; L*, a*, b*; L*, c*, h*; Yxy; XYZ; Munsell (Hue, Value, Chroma); color differences
    • Reliability/calibration: Zero calibration & white calibration functions; spectral reflectance: SD <0.1%; chromaticity value: SD <∆E*ab 0.04; inter-instrument agreement: <∆E*ab 0.2 (for 8mm aperture); short-term repeatability: higher with continuous measurements and larger aperture size
    • Intended use: Color measurement for a variety of applications. Can be used for dermatological purposes using the CM-SA Skin Analysis software
  • Mechanism: Colorimeter (classification: objective)
    • Illumination/aperture: Light source: 6 LEDs; illuminated area: 6mm
    • Measurements: L*, a*, b*; L*, c*, h*; RGB; CMYK; KRV; LCH
    • Reliability/calibration: Datacolor ColorReader mobile application is required to alert when a white calibration is needed. App will prompt when a calibration is required using the calibration tile that comes with the device.
    • Intended use: Paint color matches for most paint brands
  • Mechanism: Colorimeter (classification: objective)
    • Illumination/aperture: Light source: LED; illuminated area: 8mm; wavelength range: 400~700nm
    • Measurements: Reflectance, CIE-Lab, CIE-LCh, Hunter Lab, CIE-Luv, XYZ, Yxy, RGB, Color Difference (ΔE*ab, ΔE*cmc, ΔE*94, ΔE*00), Whiteness Index (ASTM E313-00, ASTM E313-73, CIE/ISO, AATCC, Hunter, Taube Berger Stensby), Yellowness Index (ASTM D1925, ASTM E313-00, ASTM E313-73), Blackness Index (My, dM), Staining Fastness, Color Fastness, Tint (ASTM E313-00), Color Density CMYK (A,T,E,M), Metamerism Index Milm, Munsell, Opacity, Color Strength
    • Reliability/calibration: Autocalibration
    • Intended use: Designed to measure color value, color difference value and to find similar color from color cards for printing industry, paint industry, textile industry, etc
  • Mechanism: Spectrocolorimeter (classification: objective)
    • Illumination/aperture: Light source: independent tri-directional 25 LED; illuminated area: 4 & 8mm
    • Measurements: L*, a*, b*, C*, h
    • Reliability/calibration: Autocalibration
    • Intended use: Designed to provide stable color comparisons for materials and products
  • Mechanism: Colorimeter (classification: objective)
    • Illumination/aperture: Light source: D50, D65, F11; measurement aperture: 4mm
    • Measurements: L*, a*, b*
    • Reliability/calibration: Autocalibration
    • Intended use: Designed to provide color analysis for textile, printing and dyeing, garments, shoes, leather, chemical, plastic, pigment, paint, ink, printing, metal, photography, toys etc
  • Mechanism: Narrow-band reflectance colorimeter (classification: objective)
    • Illumination/aperture: Wavelength range: red: 660nm, infrared: 880nm (melanin); green: 568nm, red: 660nm (erythema)
    • Measurements: Melanin index and erythema index (arbitrary units 0-999)
    • Intended use: Designed to measure skin melanin and hemoglobin (erythema) by reflectance
  • Mechanism: Full visible spectrum reflectance colorimeter (classification: objective)
    • Illumination/aperture: Light source: 8 LEDs arranged circularly; illuminated area: ~17mm; wavelength range: 440-670nm
    • Measurements: skin color; RGB; L*a*b*; XYZ; ITA
    • Reliability/calibration: Calibration check function; measurement error: +/- 5%; interobserver reliability: ICC 0.79-0.97 moderate/good (Van der Wal et al 2013)
    • Intended use: Designed to measure the color of skin and hair
  • Mechanism: Full visible spectrum reflectance colorimeter (classification: objective)
    • Illumination/aperture: Light source: 3 white LEDs arranged circularly; illuminated area: 0.3cm^2; angle: 45° to minimize gloss; RGB range: 25-246; peak wavelengths: 620/540/460nm
    • Measurements: skin color; L*a*b*; ITA; L*c*h*; RGB; melanin and erythema indices
    • Reliability/calibration: Calibration check function; not affected by ambient light due to direct skin contact; optical orifice designed to minimize pressure-induced skin blanching
    • Intended use: Skin color measurement for melanin, erythema and skin color

Table 1. Subjective skin color measurement methods

  • Classification: Self-reported
    • Measurements: 16 distinct skin types
    • Intended use: Classify skin type according to four different components: dry or oily, sensitive or resistant, pigmented or nonpigmented and prone or tight
  • Classification: Self-reported
    • Measurements: Six categories
    • Intended use: Classify skin type into six categories by race and genetic origin: Nordics, Europeans, Mediterraneans, Indo-Pakistanis, Africans and Asians
  • Classification: Visual, self-reported
    • Measurements: Six categories I-VI
    • Pros and cons: Limited relevance and reliability among people with darker skin color, who are clustered into 1 Fitzpatrick category.
    • Intended use: Six-point subjective classification system to assess the propensity of the skin to burn during phototherapy
  • Classification: Visual
    • Measurements: Four type scale
    • Intended use: Used to assess photoaging (rhytides and discoloration) in White individuals
  • Goldman World Classification of Skin Types
    • Classification: Visual, self-reported
    • Intended use: Used to assess skin color in response to burning or tanning, and PIH based on race/ethnicity
  • Kawada Skin Classification System for Japanese Individuals
    • Classification: Self-reported
    • Intended use: Used to describe Japanese skin types and their sensitivity to UV light, sunburn, and tanning
  • Classification: Self-reported
    • Measurements: Five different skin types
    • Intended use: Used to account for five different skin types based on geography and heredity; can be used in conjunction with FST to assess for risk factors prior to treatments such as cosmetic laser surgery or chemical peels
  • Classification: Visual
    • Measurements: Scale 1-11
    • Description: The colors of the palette came from internet photographs and the palette was pre-tested for ease of use by interviewers and to see if it covered the range of colors found in the real world, with emphasis on Latin America.
  • Classification: Visual
    • Measurements: Four category scale
    • Description: Based on newborn’s chest skin and nipples/areolae compared with photographs of neonatal chests (two patients per color) taken at 24 h of life.
  • Classification: Visual, self-reported
    • Measurements: Four elements (phototype, hyperpigmentation, photoaging, and scarring)
    • Intended use: Used to evaluate phototype, hyperpigmentation, photoaging and scarring to identify a patient’s skin type and provide data to predict the skin’s likely response to insult, injury or inflammation
  • Classification: Visual
    • Measurements: 15 uniquely colored plastic cards with 10 bands of increasingly darker color per card
    • Intended use: Used to measure pigmentation using 15 uniquely colored plastic cards spanning the full range of skin hues; each card contains 10 bands of increasingly darker gradation
  • Classification: Visual
    • Measurements: 36 colored tiles
    • Pros and cons: Of the 36 tiles, 2 were found to have the same RGB values
    • Intended use: Used to establish race classification by skin color using 36 opaque glass tiles which are compared to the patient’s skin
  • Willis and Earles scale
    • Classification: Self-reported
    • Intended use: Used to classify skin color, UV light reaction and associated pigmentary disorders in people of African descent
  • Classification: Self-reported, visual
    • Measurements: Six color bars
    • Intended use: Used to determine the amount of melanin pigment in various skin types and assess an individual’s potential to get sunburn
  • Classification: Visual
    • Measurements: Chart classifying color based on hue, value and chroma
    • Intended use: Originally intended to be used in agriculture to identify soil colors but has been adopted to describe human skin tones
  • Classification: Visual
    • Measurements: 11 point scale
    • Intended use: Used to classify human skin color without respondent seeing the chart
  • Classification: Visual
    • Measurements: 10-tone scale
    • Pros and cons: The scale encompasses a broader range of darker skin tones. One note is that the delta E (difference) between some shades on the scale is very small.
    • Intended use: Used to classify a broader range of skin colors and developed to improve skin tone evaluation in machine learning and artificial intelligence
  • Classification: Visual
    • Measurements: 138 color scale
    • Pros and cons: The number of colors is both a benefit and a challenge because some skin tones are very similar to each other and difficult to distinguish.
    • Intended use: The scale was formulated to be the closest physical representation of skin tone colors for visual reference in the beauty, fashion, photography, and product design industries.

Contributors to this page include:

This page is a collaborative effort from many individuals across many institutions including: Caroline Hughes, Ella Behnke, Fekir Negussie, Jenna Lester, Koyinsola Oyefeso, Leo Shmuylovich, Lily Ortiz, Michael Lipnick, Wim Verkruysse, the Open Oximetry team, and the Skin Color Quantification Subgroup.