WSU’s statistical geneticist makes sense of data

By Scott A. Yates

As recently as a decade ago, wheat breeders had a couple of hundred data points (that is, sets of measurements) on which to base their decisions about which preliminary lines to advance. It was difficult to make sense of all the numbers, but not impossible.

Zhiwu Zhang, Assistant Professor/Scientist, Washington State University

Fast forward to 2016 and, thanks to ever-expanding technology, data points now number in the tens of thousands, far too many for individual breeders to evaluate. That is, of course, unless they have Zhiwu Zhang in their corner.

As Washington State University’s (WSU) statistical geneticist, Zhang is a man who recognized from a young age that numbers spoke to him. Called ZZ by Westerners who have difficulty pronouncing his first name, Zhang doesn’t actually work with wheat plants. His development of new software tools and models, however, may turn out to advance variety releases faster than anything breeders can do by themselves in the greenhouse or the field.

Zhang, 55, arrived at WSU two years ago from Cornell University, where he served as senior research associate in the Institute for Genomic Diversity. His family drove the 2,500 miles from Ithaca to Pullman, but the journey to his sunny office in Johnson Hall began much earlier than that.

Born in a rural area of northeast China, Zhang’s connection to agriculture is more than scientific. His family were farmers who grew a variety of crops and raised pigs, ducks, chickens and geese.

“As the eldest of four, I helped my dad a lot. My earliest memories are of working on the farm. I remember being seven years old and flipping the sweet potato vines,” Zhang said, describing a practice required for older varieties to prevent multiple roots and small potatoes from developing.

Zhang is currently the recipient of the Washington Grain Commission’s $1.5 million distinguished endowment, which guarantees funding for his research. Although Zhang’s expertise has multiple applications (his software is used in cattle genetics and cancer research, among other fields), his primary focus for now is on wheat.

In layman’s terms, Zhang is responsible for helping breeders reveal the genetic potential of wheat lines in WSU’s breeding programs well in advance of what could be deduced just a few years ago. He’s doing this by building computer models that analyze the tens of thousands of data points revealed by the DNA of each individual line. This lets breeders know much more quickly whether the characteristics they want to transfer are present. If it all works as intended, Zhang could, in one fell swoop, cut about two years off the breeding process while improving the favorable characteristics of the varieties released.

It’s a lot to ask of one individual, but then Zhang is not your ordinary scientist. He has not one, but two PhDs, the first in animal genetics from China’s Northeast Agricultural University. Following that degree, he worked for the Chinese Academy of Agricultural Sciences for three years before he got an opportunity to visit Michigan State University.

He had no plans at the time to study for a second PhD, but when a Michigan State professor suggested he could obtain another degree in statistical genetics, he jumped at the chance. That’s because the discipline combines two of Zhang’s loves: math and genetics. He couldn’t have been happier.

His wife, Wanling, is also well educated. She now works for the Agricultural Research Service of the U.S. Department of Agriculture on the Pullman campus as a seed curator. The pair have two children. James, 28, works as a senior finance advisor for a medical group on Long Island. Joia, 15, is a high school student in Pullman.

To demonstrate Zhang’s love for and fascination with numbers, consider that before his daughter was born, he developed a computer program to choose her name. He did this by picking his 10 favorite letters in the English alphabet and telling the software to use his selection to come up with alternatives. Joia (Joy-a) was a name both he and his wife liked, but it turns out the French beat him to it without a computer. It means rejoicing.

Rich Koenig was chair of the Department of Crop and Soil Sciences in the College of Agricultural, Human, and Natural Resource Sciences in 2011, when it became apparent that the amount of available data on wheat lines was far surpassing breeders’ ability to use it. He pushed the administration to hire a scientist who could help breeders and geneticists make sense of the data.

“Once the Washington Grain Commission funded an expansion of Deven See’s genetic lab, the number of data points available as part of deciding whether a variety advanced went well beyond what a single human being could digest. We saw that a statistical geneticist was needed to write programs and code to take those thousands of data points and boil them down to what is the best set of lines the breeder can select to advance forward,” Koenig said. “Zhang is in a very high demand field now. We were lucky to hire him.”

Glen Squires, CEO of the WGC, said the board was receptive to Koenig’s entreaties, deciding even before Zhang was hired to earmark distinguished chair funds for the position. The chair had previously been funding wireworm research, which subsequently received funding through a line item in the budget.

“We’ve all heard about big data and how companies like Monsanto are using the bioinformatics to make farmers more productive. We believe Zhang’s work has the potential to do that for Washington farmers,” Squires said.

Mike Pumphrey, WSU’s spring wheat breeder, said Zhang fills an important gap between technology and analytics. His own work has already been significantly affected, he added, even though the statistical geneticist’s efforts are just in the beginning stages.

“The goal of Zhang’s work is not to pick the absolute best. The goal is to be able to throw away the worst, so that what’s left has an opportunity to be truly better,” Pumphrey said. “That way, I will know before yield trials that everything is skewed towards being better, which means the lines I do plant have a chance of being truly better in the field.”

But it’s not just that. By being able to predict performance, Pumphrey said he can feel more confident that the subsequent crosses he makes will also be improved because of the “strong statistical evidence.” Speeding up the number of crosses he makes, meanwhile, has a kind of compound-interest effect, speeding up every other part of his program.

Pumphrey’s last released variety, Ryan, took nine years from initial cross to release. With Zhang’s help, there is the potential to cut the time to release to seven years.

“That is huge,” Pumphrey said.

Zhang, meanwhile, has acclimated well to his Pacific Northwest surroundings, even joining the poker game that certain university researchers and administrators play together. Word is that although Zhang had never played poker before joining the game, his expertise in statistical analysis is proving an advantage.

“He has proven a pretty formidable player,” Pumphrey said.