Data Center: About Us
Overview
The Center for Research on Health Care (CRHC) Data Center provides state-of-the-art data management and analysis services to the University of Pittsburgh's clinical and translational research scientists. The Data Center is committed to quality assurance and research integrity. Its mission is to provide investigators with experts in data management, data entry, programming, statistical analyses, web development, and graphic design. The experts operate as a team, providing help in all phases of research and thereby ensuring efficient use of resources.
Data Center Structure
The Data Center consists of 30 faculty and staff members. The director, Doris M. Rubio, PhD, is the primary point of contact for principal investigators (PIs). Dr. Rubio is responsible for the following:
- Meeting high standards for research quality on every project.
- Acting as a liaison between the PIs and the Data Center team.
- Assisting the PIs with budget planning to make sure that funds are sufficient for data needs.
- Ensuring that data analysis and data management are completed efficiently and within budget.
The Data Center has four primary cores:
- The Information Technology Core is responsible for designing Internet applications, fulfilling programming needs, and evaluating new software. It is also responsible for systems administration.
- The Data Core provides expertise in developing databases, designing tracking programs, and constructing data entry and verification procedures.
- The Biostatistics Core includes statisticians with training at the PhD, master's, and bachelor's degree levels. The statisticians are responsible for ensuring that each study has the appropriate design, sample size, and statistical analyses. They are involved in all phases of the study, from preaward services (conducting power analyses and consulting on the methodology) to postaward services (running statistical applications and consulting with the PIs to interpret the findings).
- The Graphics Design Core creates and maintains Web sites. It also develops brochures, booklets, reports, flyers, logos, and figures for investigators and departments.
Data Center Experience
Over the past 7 years, the Data Center has worked on more than 500 research projects. It has provided data management and statistical support for many types of grants, including R01, R21, R34, P30, U01, and K awards. Its projects have ranged from pilot studies to multisite randomized clinical trials. It specializes in the development of paperless data collection systems, using both the Internet and tablet PCs.
The Data Center serves as the data team for the following:
- The Pittsburgh Pepper Center, which is funded through a P30 grant from the National Institute on Aging (NIA).
- The Multidisciplinary Clinical Research Scholars Program, which is funded through the National Institutes of Health (NIH) Clinical and Translational Science Award (CTSA).
- The Consortium of Radiologic Imaging Studies of Polycystic Kidney Disease (CRISP) and the HALT Progression of Polycystic Kidney Disease (HALT-PKD) Study, both of which are funded through a cooperative agreement with the National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK).
The director of the Data Center, Doris M. Rubio, PhD, also serves as director of the Design, Biostatistics, and Clinical Research Ethics (DBE) Core of the Clinical and Translational Science Institute (CTSI), which is funded through the CTSA.
Data Center Systems for Data Management
The Data Center builds software systems, each called a System for Data Management (eSYSDM), to support investigators who are seeking customized solutions to their data collection, tracking, follow-up, reporting, and analysis needs. Each eSYSDM is based on a detailed study protocol and set of requirements. The Data Center team works closely with investigators and other study personnel to ensure that protocols are being followed, that data integrity and confidentiality are maintained, and that the amount of incorrect or missing data is minimal. eSYSDMs facilitate the generation of clean datasets by guiding interviewers through the data collection process and displaying only the questions and screens that are appropriate for a particular subject. This minimizes the possibility that incorrect entries will be made and reduces the need for extensive data recoding and cleaning by the statistician. eSYSDMs also perform important tracking duties by monitoring various parameters (such as eligibility status, follow-up interview due dates, and study groups) to ensure that the required data collection instruments are administered within the time constraints dictated by the study protocol.
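The two ideas in the paragraph above, showing only the questions that apply to a subject and tracking follow-up due dates, can be illustrated with a minimal Python sketch. It is not the Data Center's implementation: the question IDs, the skip rules, the 90-day follow-up interval, and the 14-day tolerance are all hypothetical values chosen for illustration.

```python
from datetime import date, timedelta

# Hypothetical skip-logic rules: each question has an optional condition
# on the answers collected so far. None means "always show".
QUESTIONS = [
    {"id": "smoker", "show_if": None},
    {"id": "cigs_per_day", "show_if": lambda a: a.get("smoker") == "yes"},
    {"id": "quit_year", "show_if": lambda a: a.get("smoker") == "former"},
]


def applicable_questions(answers):
    """Return the IDs of the questions that should be displayed."""
    return [
        q["id"]
        for q in QUESTIONS
        if q["show_if"] is None or q["show_if"](answers)
    ]


def followup_window(baseline, days_after, tolerance_days=14):
    """Return (earliest, latest) allowed dates for a follow-up visit."""
    target = baseline + timedelta(days=days_after)
    tolerance = timedelta(days=tolerance_days)
    return target - tolerance, target + tolerance


def followup_status(today, window):
    """Classify a follow-up as 'not_due', 'due', or 'overdue'."""
    earliest, latest = window
    if today < earliest:
        return "not_due"
    if today <= latest:
        return "due"
    return "overdue"


if __name__ == "__main__":
    print(applicable_questions({"smoker": "yes"}))
    window = followup_window(date(2024, 1, 1), days_after=90)
    print(window, followup_status(date(2024, 4, 1), window))
```

Keeping the rules in a data structure rather than in branching code means that a protocol change only requires editing the rule table, which is one plausible reason a protocol-driven system would be organized this way.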
The following are some examples of the features of eSYSDMs:
- Web-based features, intranet-based features, or both.
- Paperless or paper-based data collection.
- Tracking for recruitment, screening, and follow-up.
- Automated randomization for randomized clinical trials.
- Range, logical, and other data validation checks.
- Data dictionaries and codebooks.
- Customized report generation.
- Training for study teams.
- Generation of analysis datasets.
- Backup and archiving of study data.
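As an illustration of two features in the list above, range and logical validation checks and automated randomization, the following Python sketch may help. It is an assumption-laden example, not the eSYSDM code: the field names, the age limits, and the choice of permuted-block randomization with two arms are hypothetical, and a real trial would follow the scheme specified in its protocol.

```python
import random


def validate_record(record):
    """Return a list of error messages; an empty list means no errors."""
    errors = []

    # Range check (hypothetical eligibility limits of 18 to 110 years).
    age = record.get("age")
    if age is None or not (18 <= age <= 110):
        errors.append("age missing or out of range (18-110)")

    # Logical check: two fields must be consistent with each other.
    systolic = record.get("systolic")
    diastolic = record.get("diastolic")
    if systolic is not None and diastolic is not None and systolic <= diastolic:
        errors.append("systolic pressure must exceed diastolic pressure")

    return errors


def permuted_block_schedule(n_blocks, arms=("treatment", "control"),
                            per_arm=2, seed=None):
    """Build an allocation list in which every block is balanced across arms."""
    rng = random.Random(seed)  # a seed makes the schedule reproducible
    schedule = []
    for _ in range(n_blocks):
        block = list(arms) * per_arm  # e.g. 2 treatment + 2 control
        rng.shuffle(block)            # random order within the block
        schedule.extend(block)
    return schedule


if __name__ == "__main__":
    print(validate_record({"age": 45, "systolic": 120, "diastolic": 80}))
    print(validate_record({"age": 12, "systolic": 70, "diastolic": 90}))
    print(permuted_block_schedule(n_blocks=2, seed=1))
```

Running the checks at entry time, rather than during analysis, is what allows errors to be corrected while the participant or source document is still available.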
Data Center Resources
In addition to high-end personal computers, the Data Center team has more than 25 servers that are maintained by an in-house systems administrator. The servers include a data warehousing and file-sharing server, a dedicated production Web server, a development server, an intranet server, and a multicore multiprocessor 64-bit server with high-speed RAID arrays for advanced statistical modeling. With this range of specialized servers, the Data Center offers both intranet and Web-based data management systems, interactive Web applications, and secure SSL-based data transfers. For intranet applications, SQL Server is the relational database management system used to build research databases. For Web-based applications, the Data Center specializes in .NET, SharePoint, Flash, and SQL Server and also works with other emerging Web technologies.
To adhere to strict security standards, servers are protected behind a hardware firewall, and access to study data is restricted to authorized personnel. Servers that house sensitive data are encrypted per University policy and are scanned monthly to check for compromises and to confirm that the latest security standards are in place. All servers use hardware fault tolerance methods to ensure the continued availability of data. Servers are physically located in a secure data center protected by key card entry, a double-lock door entry system, a motion detection system, and a temperature control system. Backups are run daily and archived weekly. The weekly archived media are stored at a secure off-site location 2 miles from the Data Center.