Data & Text

THE SKETCH 

This week’s project uses both data files and text to create a sketch that generates thousands of possible Chinese adoptee names. The link to the sketch on the web editor can be found here.

INSPIRATION & DESIGN 

What I love so much about IMA is its power to tell stories, particularly stories of marginalized identities. As a Chinese adoptee, I have always found that there is not enough widely available material about Chinese adoption, be it articles, research, books, films, or art. Hence, my goal with my artwork is always to uplift Chinese adoptee voices and shed light on aspects of the industry that have long been kept secret. For this project, I wanted to highlight a part of the adoptee experience that can be a very emotional, and in some cases controversial, topic: our names.

I should note that the sketch does not represent any real data on adoptee names; rather, I have created imaginary data on imaginary orphans, including name and gender. The sketch is made to look like an official government document, but the fact that none of the names are real highlights how extremely limited the information available inside and outside of China about the adoption industry is. One may be able to find charts, graphs, and infographics that include some statistics about Chinese adoption, but it is extremely difficult to find raw data. The reasons behind this are quite complex and beyond the scope of this paper, but they are important to note.

Not only are there limited resources on adoptee names, but the system for naming real orphans is highly superficial and formulaic. As explained in more detail in the CODING section, the program works by generating about 150 names (the loop in the code runs 155 times) by randomly selecting characters from various data files. The seemingly arbitrary way of generating names is representative of the sad truth that orphans are not named for emotional or personal reasons, but for practicality and convenience.

SOME BACKGROUND ON CHINESE NAMES

From the 1970s into the early 2010s, orphanages all over China had a uniform way of naming children. The first part is the surname, which in Chinese is the first character of a three-character name. The surname was almost always, if not always, the first character of the city, district, or province the child was found in. For example, I was born in Maoming (茂) city in Guangdong province, and my name is 茂欢贵. My adoptive brother was born in Wanzhou (万), Sichuan Province, and his surname in Chinese is also 万. Since there are only so many cities in China in comparison to the thousands upon thousands of different surnames, the surname is the most generic part of an adoptee's name, and it is often easy to tell who was once an orphan based on surname alone.

The second part is the “first” name, which really consists of two characters. The first of the two is only slightly less generic than the surname, often being shared by a handful of babies born in the same year, found in the same part of the city, etc. These are often commonly used characters such as 福 (fu), which means good luck, or 喜 (xi), which means happy. In my case, all the girls at the orphanage born in 2002 share the second character of their name, 欢 (huan). The second character of the first name is usually the most unique of the three and has the most room for variance. Unlike in normal Chinese names, though, where the final character is supposed to be representative of the child’s personality, the way it is selected for orphans is quite different. Sometimes orphanages might choose one common or “lucky” character such as 花 (hua, flower) and use all characters with the same phonetic pronunciation, hua. Sometimes they may want all babies of the same year to have characters with similar meanings, and limit the possible names accordingly. In other cases the character may be randomly selected from a dictionary.

CODING

The “data” used to generate the names comes from three files. The first file is simply a .txt that lists all the cities in the province of Guangdong. I chose one province to condense the amount of data and to emphasize the commonality among adoptee names. From this file I created an array to hold all possible surnames, as shown below:

//DATA FOR FIRST CHARACTER (least personalized)
//get the first character of all cities in Guangdong province 
for (let i = 0; i < province_outfile.length; i++) { 
  first_char.push(province_outfile[i][0]); 
}
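
The same first-character extraction can be tried outside of p5.js. The following is a minimal sketch in plain JavaScript, not the code from the sketch itself: the city list is a small hand-picked sample of Guangdong cities standing in for the .txt file, and firstChars is a hypothetical helper name. It also shows an optional step the original loop does not take, removing duplicates.

```javascript
// Small sample of Guangdong cities (a stand-in for the contents of the .txt file).
const cities = ["广州", "深圳", "茂名", "珠海", "汕头", "汕尾"];

// Hypothetical helper: take the first character of each city name.
// Array.from splits by code point, so it also handles rare characters
// stored as surrogate pairs, where name[0] would return half a character.
function firstChars(names) {
  return names.map((name) => Array.from(name)[0]);
}

const surnames = firstChars(cities);
// Optional: drop duplicates (汕头 and 汕尾 both contribute 汕).
const uniqueSurnames = [...new Set(surnames)];

console.log(surnames.join(" "));       // 广 深 茂 珠 汕 汕
console.log(uniqueSurnames.join(" ")); // 广 深 茂 珠 汕
```

Whether to remove duplicates is a design choice: keeping them means a character shared by several cities is picked more often.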

The second and third characters are pulled from a project I found on GitHub, which has data on the most common Chinese names from 1950 to 2000. The file top50char.year.csv contains the most popular characters used in personal names, sorted by year and gender. I step through each line of the file, and the values at the desired indexes are pushed into separate arrays for male and female middle names, named middle_char_m and middle_char_f respectively.

//DATA FOR MIDDLE CHARACTER (second-most personalized)
//get the top characters for male and female first names from 1950-2000
for (let i = 1; i < top_50_outfile.length; i++) {
  //read each row individually
  current_row = split(top_50_outfile[i], ",");

  //search through each row and push only the elements that correspond to names
  for (let el = 0; el < current_row.length; el++) {
    //indexes for male characters
    if (el >= 13 && el <= 18) {
      middle_char_m.push(current_row[el]);
    }
    //indexes for female characters
    if (el >= 25 && el <= 30) {
      middle_char_f.push(current_row[el]);
    }
  }
}
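
The column filtering above can also be expressed with slice(). This is a minimal sketch in plain JavaScript, assuming, as in the loop above, that indexes 13 to 18 hold the male characters and 25 to 30 the female ones. The row is made-up placeholder data, not a line from top50char.year.csv, and extractMiddleChars is a hypothetical name.

```javascript
// Hypothetical helper: pull the male and female character columns out of one CSV line.
function extractMiddleChars(line) {
  const cells = line.split(",");
  return {
    male: cells.slice(13, 19),   // indexes 13..18 inclusive
    female: cells.slice(25, 31), // indexes 25..30 inclusive
  };
}

// Placeholder row: each cell is "c" plus its index, so the result is easy to check.
const fakeRow = Array.from({ length: 32 }, (_, i) => "c" + i).join(",");
const { male, female } = extractMiddleChars(fakeRow);

console.log(male.join(" "));   // c13 c14 c15 c16 c17 c18
console.log(female.join(" ")); // c25 c26 c27 c28 c29 c30
```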

Note again that the data in this file contains the top names of Chinese citizens, not orphans. Because this dataset is on the smaller end (only about 300 characters total), I decided to use it to generate the second character.

The final characters were pulled from a file called givenname.csv, which contains popular names that correspond to all phonetic possibilities for a character. This file is not sorted by gender, however; instead, it records the number of times each name was used for males and for females. By iterating over each line in the file and comparing the two counts, names are sorted into the arrays last_char_f for female names and last_char_m for male names.

//DATA FOR LAST CHARACTER (most personalized)
for (let i = 0; i < givenname_outfile.length; i++) {
  //read each row
  current_row = split(givenname_outfile[i], ",");
  //determine whether it is a male or female name based on number of occurrences
  //split() returns strings, so convert to numbers before comparing
  if (Number(current_row[3]) > Number(current_row[4])) {
    last_char_m.push(current_row[0]); //push to male
  } else {
    last_char_f.push(current_row[0]); //push to female
  }
}
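
One detail worth keeping in mind with a comparison like this: the values that come out of a split are strings, and strings are compared character by character rather than by numeric value. The following is a minimal sketch in plain JavaScript that illustrates the difference. The helper classifyRow and the sample rows are hypothetical, and the column positions (0 for the character, 3 for the male count, 4 for the female count) simply follow the loop above.

```javascript
// Strings compare character by character, so "9" sorts after "10".
console.log("9" > "10");                 // true  (text comparison)
console.log(Number("9") > Number("10")); // false (numeric comparison)

// Hypothetical helper: classify one row by its male and female counts.
// Assumes column 0 is the character, column 3 the male count, column 4 the female count.
function classifyRow(line) {
  const cells = line.split(",");
  const maleCount = Number(cells[3]);
  const femaleCount = Number(cells[4]);
  return { char: cells[0], gender: maleCount > femaleCount ? "m" : "f" };
}

// Made-up row for illustration only (not real data from givenname.csv).
const row = "花,x,y,9,10";
console.log(classifyRow(row).gender); // f
```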

All of the above arrays were created in the setup() function, while each name is generated in the draw() function.

Inside draw(), each unique name is generated using a two-step process. The first step is to determine the gender of the name, which is decided by a function called male_or_female(), shown below:

function male_or_female() { 
  let num = random(); 
  if (num <=0.6) { 
    return 'f'; 
  }
  else { 
    return 'm'; 
  }
}

The condition if (num <= 0.6) is used so that names will be female 60% of the time. This is based on the premise that 60% of orphans in China are baby girls.
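
To see the 60/40 split in action, here is a minimal sketch in plain JavaScript. pickGender is a hypothetical variant of male_or_female() that takes the random source and the female share as parameters (both are assumptions for illustration, not part of the original sketch), and Math.random stands in for p5's random().

```javascript
// Hypothetical variant of male_or_female() with the random source and the
// female share passed in, so the split can be checked or changed.
function pickGender(rng = Math.random, femaleShare = 0.6) {
  return rng() <= femaleShare ? "f" : "m";
}

// Tally a large number of draws to see the split empirically.
let females = 0;
const draws = 10000;
for (let i = 0; i < draws; i++) {
  if (pickGender() === "f") females++;
}
console.log((females / draws).toFixed(2)); // close to 0.60, varies per run
```

Passing in the random source makes the function easy to check with fixed values, for example pickGender(() => 0.3) always returns 'f'.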

The second step is to randomly pick surnames and personal names from the arrays built in setup. This is done using a simple for loop:

for (let i = 0; i < 155; i++ ){ 
  //decide gender of the name to be generated 
  gender = male_or_female();
  //randomly get a surname
  surname = first_char[int(random(first_char.length))];

  //if name is female, pull characters from f arrays
  if (gender == 'f') {
    middle = middle_char_f[int(random(middle_char_f.length))]; 
    last = last_char_f[int(random(last_char_f.length))]; 
    personal_name = middle+last; 
    gender= '(女)';  
  } 

  //otherwise pull characters from m arrays 
  else { 
    middle = middle_char_m[int(random(middle_char_m.length))]; 
    last = last_char_m[int(random(last_char_m.length))]; 
    personal_name = middle+last; 
    gender = '(男)'; 
  }
  let full_name = surname + personal_name;
  results.push([full_name, gender]);
}

In the last line, we push the full name and gender into a results array, which is what is used to display the names on the canvas. Hence, each time the program is run, a new set of names is created.
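
The whole generation step can be condensed into a small standalone example. This is a minimal sketch in plain JavaScript under stated assumptions, not the sketch's actual code: the character pools are tiny placeholders rather than the arrays built from the three data files, and pick and makeName are hypothetical helper names.

```javascript
// Small placeholder pools; the real sketch fills these from the three data files.
const pools = {
  surname: ["茂", "广", "深"],
  f: { middle: ["欢", "喜"], last: ["花", "华"] },
  m: { middle: ["福", "喜"], last: ["贵", "华"] },
};

// Pick a random element; rng() must return a number in [0, 1).
function pick(list, rng) {
  return list[Math.floor(rng() * list.length)];
}

// Hypothetical helper: build one [full_name, gender_label] pair.
function makeName(rng = Math.random) {
  const gender = rng() <= 0.6 ? "f" : "m";
  const surname = pick(pools.surname, rng);
  const personal = pick(pools[gender].middle, rng) + pick(pools[gender].last, rng);
  return [surname + personal, gender === "f" ? "(女)" : "(男)"];
}

// Generate a handful of names; the output differs on every run.
const results = [];
for (let i = 0; i < 5; i++) {
  results.push(makeName());
}
console.log(results.map(([name, label]) => name + " " + label).join("\n"));
```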

REFLECTIONS

While the code for this project is not particularly challenging, finding data to work with was. Once I found material to work with, though, I thoroughly enjoyed every step of this project. Reading through the names the program generated was a very self-reflective experience for me. It made me think about how my name simultaneously holds so much and so little meaning, and how any one of the names generated could be the name of a real-life adoptee somewhere in the world. While I understand that most people know little to nothing about Chinese adoption, let alone the naming system, my hope is that this piece can open the conversation to all who are interested in thinking critically about adoption.

 

REFERENCES

https://github.com/psychbruce/ChineseNames

 

