Home

I am a senior research scientist studying computer architecture at NVIDIA in Austin, TX. I conduct research in the design of efficient, dependable, and secure computer systems.

A photo from 2023

Current Research Interests

Low-Cost Security

As technology advances, ensuring the security of computer systems and networks is becoming essential for preventing cyberattacks, protecting confidential data, and maintaining the trust of individuals and businesses. A large part of current research focuses on ensuring security without costly overheads or prohibitive design complexity. A key goal of my recent publications is memory safety, which involves preventing memory-related vulnerabilities, such as buffer overflows and dangling pointer dereferences, that can lead to serious security breaches.

Strong Memory System Reliability

The size and sensitivity of computer memory make its protection the first order of business for a reliability-conscious designer. Despite the long and successful history of using error coding techniques to mitigate memory error rates, there is still a need for strong and flexible memory error protection techniques for large supercomputers and other high-performance systems. Correspondingly, much of my recent research focuses on techniques to provide very high levels of main memory reliability without exceeding the current industry standard storage footprint for error-correcting codes.

Efficient and Reliable Application-Specific Acceleration

Increasing levels of integration allow specialized hardware units to be placed on-chip cost-effectively. This, combined with the ever-increasing need for energy-efficient execution, will make the hardware acceleration of important applications and workloads more commonplace. To this end, some of my research has targeted the efficient acceleration of workloads that exhibit fine-grained gather/scatter memory access patterns, as well as DRAM link compression and the reliability characterization of DNN accelerators.

Other Interests

System-Level Reliability

Reliability and resilience are major obstacles on the road to exascale computing and beyond. The number of components required for exascale systems and the decreasing inherent reliability of components in future fabrication technologies conspire to make reliability a first-order design concern. A strong system-level approach to reliability is needed to efficiently handle errors at all scales. In addition, it is important to enable and exploit cross-layer reliability through system-level mechanisms: many different failures can occur in a large computer system, and superior efficiency can only be achieved by dealing with every error in the appropriate manner and system layer.

Arithmetic Error Detection

Rising levels of integration and decreasing component reliabilities make error protection increasingly important. At the same time, the need for energy efficiency necessitates the careful evaluation of resilience techniques. Arithmetic error protection is typically more expensive than the protection of memory or data movement, requiring large amounts of redundant logic. Protection of computer arithmetic has correspondingly been reserved for critical or high-availability applications. Given current reliability trends, some of my research is focused on providing strong, flexible, and low-cost error protection for arithmetic operations.