Center for Research on Health Care - Data Center
Suite 200,
200 Meyran Ave, Pittsburgh, PA 15213-3221
Phone: 412-692-4873
email contact:
dcweb@pitt.edu

Data Center - About Us

Overview

The Center for Research on Health Care Data Center (DC) provides state-of-the-art data management and analysis services to the University of Pittsburgh's clinical and translational researchers. The DC's mission is to provide researchers with consistent, high-quality information technology, data management, and statistical services. The DC operates as a team with expertise in all phases of research, which helps ensure efficient use of resources. The DC is committed to quality assurance and research integrity. Drawing on extensive experience, the DC provides research faculty with experts in data management, data entry, programming, and statistical analysis.

DC Structure

The DC consists of 30 faculty and staff. The DC Director (Dr. Doris Rubio) is the primary point of contact for the Principal Investigators (PIs). Dr. Rubio is responsible for:

  • Ensuring high standards for research quality on every project
  • Acting as a liaison between the PIs and the DC team
  • Assisting PIs with budget planning to ensure sufficient budget for data needs
  • Ensuring that data analysis and data management are completed efficiently and within budget

The DC comprises three primary units: the Information Technology Core, the Data Core, and the Biostatistics Core. Each unit is described below.

  • The Information Technology Core is responsible for designing internet applications, fulfilling programming needs, evaluating new software, and systems administration.
  • The Data Core provides expertise in developing databases, designing tracking programs, and constructing data entry and verification procedures.
  • The Biostatistics Core consists of statisticians at the Ph.D., master's, and bachelor's degree levels. They are responsible for ensuring that each study has the appropriate design, sample size, and statistical analyses. They are involved in all phases of the study, from pre-award (conducting power analyses and consulting on the methodology) to post-award (running statistical applications and consulting with PIs to interpret the findings).

DC Experience

Over the last 5 years, the DC has worked on more than 300 research projects. We are the data team for the Pittsburgh Pepper Center (funded through an NIA P30 grant), the Multidisciplinary Clinical Research Scholars Program (funded through NIH's Clinical and Translational Science Award, CTSA), and the Consortium of Radiologic Imaging Studies of Polycystic Kidney Disease (CRISP) (funded through a cooperative agreement with NIDDK). Dr. Doris Rubio serves as the Core Director for Design, Biostatistics, and Clinical Research Ethics at the Clinical and Translational Science Institute (funded through NIH's CTSA). We have provided data management and statistical support for numerous grants, including R01, R21, R34, and K awards. Projects on which we have worked range from multisite randomized clinical trials to pilot studies. We specialize in the development of paperless data collection systems using both the internet and tablet PCs.

DC Systems for Data Management

The Systems for Data Management (SYSDMs) we create are designed to support investigators seeking customized solutions for data collection, tracking, follow-up, reporting, and analysis. SYSDMs are developed based on detailed study protocols and requirements. We work closely with investigators and other study personnel to ensure that protocols are followed, that data integrity and confidentiality are maintained, and that the resulting data contain as few incorrect and missing values as possible. SYSDMs facilitate the generation of clean datasets by guiding interviewers through the data collection process so that only the questions and screens appropriate for the particular subject are displayed. This eliminates most incorrect entries and reduces the need for extensive recoding and cleaning by the statistician. SYSDMs also perform important tracking duties by monitoring parameters such as eligibility status, follow-up interviews due, and study group to ensure that the required data collection instruments are administered within the time constraints dictated by the study protocol.
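As an illustration only, the idea of displaying only the questions appropriate for a particular subject (often called skip logic) could be sketched in Python as follows. The Question class, the next_questions function, and the example questions are hypothetical and are not taken from any actual SYSDM, which is built with the technologies described under DC Resources.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

Answers = Dict[str, Any]


def always(answers: Answers) -> bool:
    """Default rule: the question applies to every subject."""
    return True


@dataclass
class Question:
    name: str                                    # hypothetical field name
    text: str                                    # wording shown to the interviewer
    applies: Callable[[Answers], bool] = always  # rule deciding whether to show it


def next_questions(questions: List[Question], answers: Answers) -> List[str]:
    """Return names of unanswered questions that apply given the answers so far."""
    return [
        q.name
        for q in questions
        if q.name not in answers and q.applies(answers)
    ]


# Hypothetical instrument: the follow-up question is shown only to smokers.
QUESTIONS = [
    Question("smoker", "Do you currently smoke?"),
    Question(
        "cigs_per_day",
        "How many cigarettes do you smoke per day?",
        lambda a: a.get("smoker") == "yes",
    ),
    Question("age", "What is your age in years?"),
]
```

Because the cigarette question is never presented to a subject who answers "no", an interviewer cannot record a value for it by mistake, which is the kind of incorrect entry the guided process is meant to prevent.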

Some of the important features of SYSDMs are listed below:

  • Web-based and/or intranet-based
  • Paperless or paper-based data collection
  • Tracking for recruitment, screening, and follow-up
  • Automated randomization for randomized clinical trials
  • Range, logical, and other data validation checks
  • Data dictionaries and codebooks
  • Customized report generation
  • Training for study teams
  • Generation of analysis datasets
  • Backup and archiving of study data
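To make two of the listed features concrete, the sketch below shows one common way to implement automated randomization (permuted blocks) and simple range and logical checks in Python. This is a minimal sketch under stated assumptions: the function names, arm labels, block size, field names, and limits are all hypothetical, and the list above does not specify which randomization scheme or validation rules any given SYSDM uses.

```python
import random
from typing import Any, Dict, List, Optional, Sequence


def permuted_block_schedule(
    n_blocks: int,
    arms: Sequence[str] = ("A", "B"),  # hypothetical arm labels
    per_arm: int = 2,                  # assignments per arm within each block
    seed: Optional[int] = None,
) -> List[str]:
    """Build an allocation list in which every block is balanced across arms."""
    rng = random.Random(seed)
    schedule: List[str] = []
    for _ in range(n_blocks):
        block = [arm for arm in arms for _ in range(per_arm)]
        rng.shuffle(block)  # random order within the block
        schedule.extend(block)
    return schedule


def validate(record: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in one record (empty if none)."""
    errors: List[str] = []
    age = record.get("age")
    if age is None:
        errors.append("age: missing")
    elif not 18 <= age <= 110:  # range check; limits are assumed
        errors.append("age: outside 18-110")
    # Logical check: two answers that cannot both be true.
    if record.get("sex") == "male" and record.get("pregnant") == "yes":
        errors.append("pregnant: inconsistent with sex")
    return errors
```

Balancing within blocks keeps the study groups close to equal in size throughout enrollment, and running checks at entry time means problems are flagged while the subject or source document is still available.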

DC Resources

In addition to personal computers, the DC team has 12 servers, which are maintained by an in-house Systems Administrator. The servers include a data warehousing and file sharing server, a dedicated production web server, a development server, an intranet server, and a multi-core, multi-processor 64-bit server with high-speed RAID arrays for advanced statistical modeling. With our range of specialized servers, we offer both intranet and web-based data management systems, interactive web applications, and secure SSL-based data transfers. For intranet applications, we program research databases in Microsoft Access and SQL Server; for web-based applications, we specialize in .NET, SharePoint, Flash, and SQL Server, as well as many other emerging web technologies.

In adherence to strict security standards, servers are secured behind a firewall, with access to study data restricted to authorized personnel only. All servers use hardware fault tolerance methods to ensure the continued availability of data. Backups are run daily and archived weekly. Weekly archived media are stored at a secure off-site location.