DATA WAREHOUSES FOR TEXAS CHARTER SCHOOLS
Using Best Practices to Implement a Secure and Reliable Data Warehouse
Bringing your data together so that insight can be gleaned via reports and visualizations is a key goal for charter IT teams
Getting security and privacy right when you build a data warehouse is a must
Creating and maintaining a data warehouse is a complex task, requiring an eye to the long term. A charter school that plans to add one needs to make safety and security paramount considerations. They’re equally important when reviewing or upgrading an existing installation.
Safety is a broad concept, not limited to protection from intruders and disasters. Data needs to be safe from obsolescence and loss of usability. Avoiding technological dead ends should be a major part of a school system’s long-term planning and maintenance policies.
Security is vital. Any system which is reachable from the Internet is a target. A data breach does serious harm to the school and the students.
Starting on the right foot
One of the first questions is where to put the data warehouse. The main alternatives are to use a cloud service, host it on-premises, or use a data center.
In some cases, data governance policies require keeping the data on the premises. Systems that the data warehouse will rely on may already be in place, and keeping everything on the same LAN may be the most practical approach.
Where there’s a choice, the cloud is usually more economical. It doesn’t require an up-front investment in equipment or hardware upgrades. Costs are more predictable and easier to budget. As requirements increase, cloud services scale up without much trouble.
Cloud systems generally offer better security than an on-premises system that lacks top-quality IT management. The provider’s computers are physically protected, constantly monitored, and regularly updated.
Running an on-premises system requires some IT expertise. The staff is responsible for backups, software upgrades, and hardware maintenance. Using a data center reduces some of these requirements, but the IT staff is still responsible for keeping the systems running safely and reliably.
The only way to know which will be most economical is to conduct a cost analysis. It needs to take all factors into account, including personnel requirements and contingency planning.
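A comparison of this kind can be sketched as a simple multi-year total. The cost categories and every dollar figure below are hypothetical placeholders chosen only to show the arithmetic; a real analysis would substitute vendor quotes and local staffing costs.

```python
from dataclasses import dataclass


@dataclass
class HostingOption:
    """Cost inputs for one hosting choice. All values are hypothetical."""
    name: str
    upfront: float             # one-time hardware or setup cost
    annual_service: float      # subscription, hosting fees, or power and maintenance
    annual_staff: float        # share of IT staff time, in dollars
    annual_contingency: float  # reserve for outages, recovery, replacements


def total_cost(option: HostingOption, years: int) -> float:
    """Total cost over the planning horizon: up-front plus recurring costs."""
    recurring = (
        option.annual_service + option.annual_staff + option.annual_contingency
    )
    return option.upfront + recurring * years


def cheapest(options, years: int) -> HostingOption:
    """Return the option with the lowest total cost over `years`."""
    return min(options, key=lambda o: total_cost(o, years))


# Placeholder numbers for illustration only.
OPTIONS = [
    HostingOption("cloud", 0, 24_000, 15_000, 2_000),
    HostingOption("on-premises", 60_000, 6_000, 40_000, 5_000),
    HostingOption("data center", 30_000, 18_000, 30_000, 3_000),
]
```

Calling `total_cost(option, 5)` for each entry gives a five-year figure to compare. Changing the horizon or the staffing estimate can change which option comes out ahead, which is why the analysis needs real inputs.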
The role of data standards in a data warehouse
The design of the warehouse needs to work from the requirements for ingesting data, including types of data not yet planned. It also needs to consider long-term usability and flexibility. A design which limits the ability to add new data categories or is locked into one vendor’s software is risky.
Building the warehouse on widely used, comprehensive data standards greatly reduces the risk. For charter schools, the Ed-Fi standard is one of the best choices available. In fact, we’ve talked with data leaders at a large charter who say that Ed-Fi is the standard in the space right now. It provides a consistent way of representing data across multiple domains and applications. Many applications are available for handling data that complies with the standard.
Ed-Fi emphasizes the ability to visualize and analyze data. Many of its applications provide ways to measure student performance and identify trends.
Consistency is important. By its nature, a data warehouse brings in information from diverse sources. If different sets of data use different data standards, bringing all the information together for analysis is difficult.
Designing the warehouse
The design process requires a roadmap for what data to bring in and on what schedule. Users’ needs should drive the roadmap. It may be tempting to throw in everything that’s available, but that adds complexity and makes the warehouse harder to maintain. It’s always possible to add more kinds of data later, but removing information goes against the idea of a data warehouse. In particular, personally sensitive information should be included only if it’s really necessary.
The first questions should be: How will people use the data? What reports and visualizations will they need? Developing the plan should involve the users rather than hand them a decree that may not match what their jobs require.
The roadmap will provide a list of the object types and relations which are needed. This is the raw information for creating a schema. If you start with a standard such as Ed-Fi, a lot of your schema work is already done. There are standard data representations, and sticking with them will let compatible applications work without problems.
However, there are always special data objects for a given charter school system. Their design should be as consistent with the data standard’s approach as possible.
After the schema comes the design of the extract, transform, load (ETL) process. The details will depend on the data sources. If they’re clean and consistent, moving their information into the warehouse will be fairly straightforward. If they contain dirty data, it will be necessary either to find software that can clean up the inconsistencies or to allow time for manual review. Fixing records manually is very time-consuming and should be a last resort.
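As a rough illustration of the transform step, the sketch below normalizes what it can automatically and sets aside only the records it cannot fix for manual review. The field names (`student_id`, `enrollment_date`, `grade_level`), the accepted date formats, and the grade codes are assumptions made for the example; they are not taken from Ed-Fi or from any particular source system.

```python
from datetime import datetime
from typing import Optional

# Hypothetical source formats; a real feed would document its own.
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y")
GRADE_ALIASES = {"k": "KG", "kg": "KG", "kindergarten": "KG"}


def normalize_date(raw: str) -> Optional[str]:
    """Return the date in ISO format, or None if no known format matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None


def normalize_grade(raw: str) -> Optional[str]:
    """Map grade spellings to one representation: 'KG' or '01' to '12'."""
    value = raw.strip().lower()
    if value in GRADE_ALIASES:
        return GRADE_ALIASES[value]
    if value.isascii() and value.isdigit() and 1 <= int(value) <= 12:
        return f"{int(value):02d}"
    return None


def transform(rows):
    """Split rows into cleaned records and records needing manual review."""
    clean, review = [], []
    for row in rows:
        student_id = row.get("student_id", "").strip()
        date = normalize_date(row.get("enrollment_date", ""))
        grade = normalize_grade(row.get("grade_level", ""))
        if student_id and date and grade:
            clean.append(
                {
                    "student_id": student_id,
                    "enrollment_date": date,
                    "grade_level": grade,
                }
            )
        else:
            # Anything that cannot be normalized goes to a person, untouched.
            review.append(row)
    return clean, review
```

Keeping the review queue separate from the cleaned output makes the size of the manual workload visible before the load step runs.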
Assessing an existing data warehouse
An existing data warehouse needs periodic review and updating to work well. If the original design had problems or years have gone by without an assessment, this is especially important. The assessment needs to cover usability, security, and privacy.
The first issue is the quality of the data. Are the representations inconsistent? Are there many gaps with no information? Problems like these lead to erroneous readings and skewed results, making analyses unreliable. It isn’t always possible to improve the data, but at a minimum, the managers should know what parts have problems and what effects they might have.
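A quick profile of each column is one way to find out which parts of the data have problems. The sketch below counts gaps and flags values that differ only in capitalization, a common sign of inconsistent representation. It assumes rows are available as plain dictionaries; the `campus` column in the usage note is a made-up example.

```python
def profile_column(rows, column):
    """Summarize gaps and inconsistent spellings for one column."""
    values = [row.get(column) for row in rows]
    present = [
        str(v).strip()
        for v in values
        if v is not None and str(v).strip() != ""
    ]
    missing = len(values) - len(present)

    # Group values by their lowercase form; more than one spelling in a
    # group hints at an inconsistent representation.
    by_folded = {}
    for v in present:
        by_folded.setdefault(v.lower(), set()).add(v)
    inconsistent = {
        key: sorted(spellings)
        for key, spellings in by_folded.items()
        if len(spellings) > 1
    }

    return {
        "rows": len(values),
        "missing": missing,
        "missing_rate": missing / len(values) if values else 0.0,
        "inconsistent_spellings": inconsistent,
    }
```

For example, profiling a `campus` column that contains both "North" and "north" would list the two spellings together, and a high `missing_rate` would mark the column as one whose reports deserve a caveat.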
The next question is whether the available reports and visualizations are sufficient for current needs. If users can’t get all the information they need, it could be time to upgrade the software or add new applications.
Whether the warehouse is new or under review, security and privacy are crucial. The warehouse should meet all the requirements in this checklist:
- Data should be encrypted both in transit and at rest.
- The encryption algorithm should be strong and reliable under the conditions of modern computing.
- Encryption keys have to be well protected.
- Personally identifiable information (PII) needs to be restricted, to minimize the chances that a data breach will expose it.
- Access to data needs to be limited, and the list of people who have access needs prompt updating whenever someone changes jobs.
- The assignment of access roles should use the principle of least privilege.
- Off-system use of data, especially PII, needs to be under strict security policies.
No system is 100% secure, but applying best practices at every level will minimize the chances of data loss.
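Two of the checklist items, least privilege and restricting PII, can be illustrated with a small sketch. The role names, the field names, and the idea of replacing the student identifier with a keyed hash are assumptions made for this example, not requirements of any standard. In practice the secret would come from a protected key store, never from source code.

```python
import hashlib
import hmac

# Hypothetical roles and the only fields each may read (least privilege).
ROLE_FIELDS = {
    "teacher": {"student_key", "grade_level", "attendance_rate"},
    "registrar": {
        "student_key",
        "grade_level",
        "attendance_rate",
        "date_of_birth",
    },
}


def pseudonymize(student_id: str, secret: bytes) -> str:
    """Replace a real identifier with a keyed hash (HMAC-SHA256) so reports
    can still join records without exposing the identifier itself."""
    return hmac.new(secret, student_id.encode("utf-8"), hashlib.sha256).hexdigest()


def view_for(role: str, record: dict, secret: bytes) -> dict:
    """Return only the fields the role may see; unknown roles get nothing."""
    allowed = ROLE_FIELDS.get(role, set())
    shaped = dict(record)  # copy, so the caller's record is not modified
    shaped["student_key"] = pseudonymize(shaped.pop("student_id"), secret)
    return {k: v for k, v in shaped.items() if k in allowed}
```

Because each role starts with no access and gains only the fields listed for it, a field added to the warehouse later stays hidden until someone deliberately grants it.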
Maintaining safety and security over time
Achieving safety and security isn’t a one-time task. It requires constant monitoring and periodic review. Regular audits of usability, privacy, and security are necessary. Usage patterns evolve, the regulatory environment changes, and new threats emerge. Changes to the system may create problems that its implementers don’t notice.
The consequences of a breach can be very serious. A charter school’s task is to take care of its students, and this includes protecting their privacy. A leak of Social Security numbers and financial data has the potential for serious monetary harm. Repairing the damage costs the students’ families time and causes them anxiety. Health-related information or psychological evaluations that leak out can mean a serious loss of privacy. Parents and students lose their trust in the school system when this happens.
Keeping a charter school’s data warehouse safe and secure isn’t an easy task, but it’s a vital one. It requires careful planning from the beginning and ongoing attention after it’s in place. Giving it the necessary attention minimizes its problems and maximizes its value.