
How to Ensure Data Integrity with Effective DSPM

Published Jul 25, 2023

Introduction

Data integrity plays a crucial role in today’s data-driven world. It ensures the accuracy, consistency, and reliability of data throughout its lifecycle, making it a fundamental aspect of data quality and trustworthiness. Organizations that rely on data across their operations therefore need effective Data Security Posture Management (DSPM) practices to keep that data protected and trustworthy.

In this article, we will explore the concept of data integrity, its importance in various domains, and delve into the key components of DSPM. Additionally, we will walk through the development of a DIY online file upload system, highlighting how to incorporate DSPM principles such as access controls, data encryption, and secure coding practices.

The Need for Data Integrity

Data integrity refers to the accuracy, consistency, and reliability of data throughout its lifecycle. It ensures that data remains complete, intact, and unaltered during storage, processing, and transmission. Maintaining data integrity is essential to ensuring data quality and trustworthiness. Let us look at the different reasons why we need data integrity:

Informed Decision Making: Accurate and reliable data enables trustworthy decision making as it provides a solid foundation for identifying trends and making strategic choices.

Regulatory Compliance: Maintaining data integrity is essential for complying with laws like GDPR, HIPAA, and SOX that are designed to protect sensitive information and prevent data breaches.

Customer Trust: Data integrity directly impacts customer confidence as customers feel assured that their information is accurate, secure, and unaltered.

Reliable Analytics and Reporting: Data integrity ensures accurate reports and meaningful data analysis, preventing flawed insights and misinformed decisions.

Data Consistency and Integration: With data integrity, organizations can seamlessly integrate and share data across systems and applications, fostering an efficient data-driven environment.

Data Recovery and Resilience: Data integrity measures like backups and error detection/correction help recover lost or corrupted data, minimizing the impact of system failures.
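
The error detection mentioned in the last point can be illustrated with a checksum. The following is a minimal sketch using only the Python standard library; the helper names sha256_digest and is_unaltered and the sample data are assumptions made for illustration, not part of the application built later in this article.

```python
import hashlib
import hmac

def sha256_digest(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()

def is_unaltered(data: bytes, expected_digest: str) -> bool:
    """Compare the current digest with the one recorded earlier."""
    return hmac.compare_digest(sha256_digest(data), expected_digest)

# Record a digest when the data is stored (hypothetical sample data)
original = b"quarterly-report,2023,42\n"
recorded = sha256_digest(original)

print(is_unaltered(original, recorded))         # True
print(is_unaltered(original + b"x", recorded))  # False
```

A plain digest only detects accidental or unsophisticated changes; anyone who can modify both the data and the recorded digest can hide the change, which is why the digest is usually stored separately or replaced by a keyed HMAC.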

Data Security Posture Management

Data Security Posture Management (DSPM) refers to the practices and processes that organizations implement to assess, maintain, and enhance their overall data security posture. It involves a holistic approach to managing data security by considering various aspects such as data protection, access controls, risk management, threat intelligence, and compliance.

DSPM aims to ensure the confidentiality, integrity, and availability of data throughout its lifecycle. It involves proactive measures to identify potential risks, establish appropriate security controls, and continuously monitor and assess the effectiveness of those controls. By implementing DSPM, organizations can mitigate security risks, protect sensitive data, and maintain regulatory compliance.

The Key Components of DSPM

DSPM solutions typically consist of several key components that work together to establish a robust and proactive approach to data security and help organizations protect their sensitive data. Let’s explore them here:

Risk Assessment
DSPM involves conducting comprehensive risk assessments to identify potential vulnerabilities, threats, and risks to data security. This involves evaluating the impact and likelihood of various risks and prioritizing mitigation efforts accordingly.

Access Controls
DSPM implements access control mechanisms to enforce granular permissions and restrict unauthorized access to sensitive data. User roles and their respective access privileges are commonly defined using role-based access control (RBAC).
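
As a rough illustration of RBAC, the sketch below maps roles to permissions and adds an ownership check. The role names, the permission names (read_any, read_own, and so on), and the can_access helper are all assumptions chosen for this example.

```python
# Hypothetical role-to-permission mapping for a file service
ROLE_PERMISSIONS = {
    "admin": {"upload", "read_any", "delete_any"},
    "user": {"upload", "read_own", "delete_own"},
    "guest": set(),
}

def can_access(role: str, action: str, requester: str, owner: str) -> bool:
    """Decide whether requester may perform action ('read' or 'delete') on owner's file."""
    permissions = ROLE_PERMISSIONS.get(role, set())
    if f"{action}_any" in permissions:
        return True
    # "_own" permissions only apply to the requester's own files
    return f"{action}_own" in permissions and requester == owner

print(can_access("admin", "read", "alice", "bob"))   # True
print(can_access("user", "read", "alice", "alice"))  # True
print(can_access("user", "read", "alice", "bob"))    # False
```

Unknown roles fall back to an empty permission set, so the default is to deny.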

Data Encryption
Data encryption protects the confidentiality of data. It ensures that even if someone gains unauthorized access to your data, the data remains unintelligible without the decryption key. Authenticated encryption schemes additionally detect tampering, which protects data integrity as well.

Secure Coding Practices
DSPM involves utilizing secure coding techniques to produce dependable and robust software systems. These approaches cover procedures such as verifying inputs, encoding outputs, safely storing data, and protecting against common vulnerabilities like SQL injection.

Threat Intelligence
DSPM leverages threat intelligence feeds and tools to stay updated on the latest security threats, vulnerabilities, and attack techniques.

Continuous Monitoring and Incident Response
Setting up monitoring systems and incident response plans will help you quickly identify and address security incidents. This helps you check for vulnerabilities, spot anomalies, and flag security breaches.
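
One simple form of monitoring is counting denied requests per client and flagging clients that exceed a threshold. The sketch below assumes the access log is available as (client, status code) pairs; the function name, the threshold, and the sample events are hypothetical.

```python
from collections import Counter

def flag_suspicious_clients(events, threshold=3):
    """Return clients with at least `threshold` denied (HTTP 403) requests."""
    denied = Counter(client for client, status in events if status == 403)
    return sorted(client for client, count in denied.items() if count >= threshold)

# Hypothetical access log entries: (client address, HTTP status code)
events = [
    ("10.0.0.5", 403), ("10.0.0.5", 403), ("10.0.0.7", 200),
    ("10.0.0.5", 403), ("10.0.0.7", 403), ("10.0.0.9", 200),
]
print(flag_suspicious_clients(events))  # ['10.0.0.5']
```

A production setup would typically add a time window and feed the result into an alerting or incident response process.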

Compliance Management
DSPM involves assessing compliance requirements, implementing necessary controls, and maintaining proper documentation for audits and regulatory purposes.

DSPM in an Online File Upload System
To understand DSPM better we are going to put it into practice in an online file upload system. The first step is to conduct a thorough risk assessment to identify potential vulnerabilities and risks associated with the online file upload system. Then we will evaluate the impact and likelihood of each risk and prioritize mitigation efforts accordingly.
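
The prioritization step can be sketched as an impact times likelihood score. The risks and their ratings below are hypothetical examples for a file upload system, not the result of a real assessment, and the 1 to 5 scale is an assumption.

```python
from dataclasses import dataclass

@dataclass
class Risk:
    name: str
    impact: int      # 1 (low) to 5 (severe)
    likelihood: int  # 1 (rare) to 5 (frequent)

    @property
    def score(self) -> int:
        return self.impact * self.likelihood

def prioritize(risks):
    """Order risks from highest to lowest score."""
    return sorted(risks, key=lambda risk: risk.score, reverse=True)

# Hypothetical ratings for illustration only
risks = [
    Risk("Malicious file upload", impact=5, likelihood=3),
    Risk("Unauthorized download", impact=4, likelihood=4),
    Risk("Oversized upload exhausts disk", impact=2, likelihood=3),
]
for risk in prioritize(risks):
    print(f"{risk.score:>2}  {risk.name}")
```

The highest-scoring risks are the ones to mitigate first.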

Access Controls: Implement RBAC to control user access to different functionalities and data within the file upload system. Define roles such as “user”, “admin”, and “guest” with appropriate permissions. Users should only be allowed to upload and manage their own files, while admins can have full control over all files.

Data Encryption: Protect sensitive user files both in transit and at rest by using encryption. To guarantee the confidentiality and integrity of the data, use established encryption methods such as AES. Encrypt the files before storing them on the server and decrypt them when accessed by authorized users.

Secure Coding Practices: Implement secure coding practices throughout the project to prevent common vulnerabilities. These practices may include input validation, file type verification, file size restrictions, and secure file storage.

Threat intelligence, continuous monitoring, and incident response complement these measures once the system is running. By integrating these components into the online file upload system project, you can establish a more secure and robust data security posture.
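
The file type and size checks listed under secure coding practices might look like the following sketch. The allow-list, the size limit, and the validate_upload helper are assumptions for illustration; the application later in this article does not include them.

```python
import os

ALLOWED_EXTENSIONS = {".txt", ".pdf", ".png"}  # hypothetical allow-list
MAX_SIZE_BYTES = 1024 * 1024                   # hypothetical 1 MiB limit

def validate_upload(file_name: str, data: bytes) -> list:
    """Return a list of validation errors; an empty list means the upload is accepted."""
    errors = []
    base_name = os.path.basename(file_name)
    # Reject names such as "../secret.txt" that carry directory components
    if not base_name or base_name != file_name:
        errors.append("file name must not contain directory components")
    extension = os.path.splitext(base_name)[1].lower()
    if extension not in ALLOWED_EXTENSIONS:
        errors.append(f"file type {extension or '(none)'} is not allowed")
    if len(data) > MAX_SIZE_BYTES:
        errors.append("file exceeds the size limit")
    return errors

print(validate_upload("report.txt", b"hello"))  # []
print(validate_upload("run.exe", b""))          # ['file type .exe is not allowed']
```

Checking the extension is only a first filter, since a file's name says nothing reliable about its contents.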

Code Tutorial

Let’s walk through a code tutorial for implementing some components of DSPM in a Python online file storage system. We’ll cover access controls, data encryption, and secure coding practices. In a real-world scenario, you would need to consider additional security measures and requirements.

Access Controls
To implement access controls, we’ll create different user roles (user and admin) and restrict access to certain routes based on the user’s role.

user_roles = {
   'admin': ['upload_documents', 'get_document'],
   'user': ['upload_documents']
}
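
@app.route('/upload', methods=['POST'])
def upload_file():
    """Uploads a file to the doc storage."""
    user_role = request.headers.get('User-Role')
    if 'upload_documents' not in user_roles.get(user_role, []):
        return jsonify({'error': 'Access Denied'}), 403
    payload = request.get_json(silent=True)
    if not isinstance(payload, dict):
        return jsonify({'error': 'Expected a JSON object'}), 400
    # Keep only the final path component so a name cannot escape the storage
    # folder (requires "import os", as in the full app.py listing)
    file_name = os.path.basename(str(payload.get("name", "")))
    file_content = payload.get("data")
    if not file_name or not isinstance(file_content, str):
        return jsonify({'error': 'Invalid file name or data'}), 400
    os.makedirs(base_encrypted_file_path, exist_ok=True)
    stored_path = os.path.join(base_encrypted_file_path, file_name)
    # Store every upload; only files flagged as sensitive are encrypted at rest
    with open(stored_path, "w") as file:
        file.write(file_content)
    if is_sensitive_data(file_name):
        encrypt_file(stored_path)
    return 'File uploaded successfully'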

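@app.route('/get_file', methods=['GET'])
def get_file():
    """Gets a file from the doc storage."""
    user_role = request.headers.get('User-Role')
    if 'get_document' not in user_roles.get(user_role, []):
        return jsonify({'error': 'Access Denied'}), 403
    # Keep only the final path component so a request cannot escape the storage folder
    file_name = os.path.basename(request.args.get('file_path', ''))
    stored_path = os.path.join(base_encrypted_file_path, file_name)
    if not os.path.isfile(stored_path):
        return jsonify({'error': 'File not found'}), 404
    sensitive = is_sensitive_data(file_name)
    if sensitive:
        decrypt_file(file_name)  # decrypts the stored copy in place
    try:
        with open(stored_path, 'r') as file:
            file_data = file.read()
    finally:
        if sensitive:
            encrypt_file(stored_path)  # restore the encrypted copy on disk
    return jsonify({'file_data': file_data})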


In this example, we define the user roles and their associated permissions in the user_roles dictionary. Each route checks the user's role obtained from the User-Role header, and if the user has the required permission, the corresponding functionality is executed. Otherwise, an "Access Denied" response with an HTTP status code 403 is returned.
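
Because the User-Role header is set by the client, any caller can claim to be an admin. One way to close that gap is to have the server issue a signed token and derive the role from it. The sketch below uses an HMAC from the standard library; the token format, the SECRET_KEY value, and both function names are assumptions for illustration and are not part of the application in this article.

```python
import hashlib
import hmac

# Hypothetical server-side secret; in practice load it from secure configuration
SECRET_KEY = b"replace-with-a-random-server-side-secret"

def _sign(message: str) -> str:
    return hmac.new(SECRET_KEY, message.encode(), hashlib.sha256).hexdigest()

def issue_role_token(username: str, role: str) -> str:
    """Create a token of the form username:role:signature."""
    message = f"{username}:{role}"
    return f"{message}:{_sign(message)}"

def role_from_token(token: str):
    """Return the role if the signature is valid, otherwise None."""
    try:
        username, role, signature = token.rsplit(":", 2)
    except ValueError:
        return None
    expected = _sign(f"{username}:{role}")
    if hmac.compare_digest(signature.encode(), expected.encode()):
        return role
    return None

token = issue_role_token("alice", "user")
print(role_from_token(token))                              # user
print(role_from_token(token.replace(":user:", ":admin:")))  # None
```

This sketch has no expiry or revocation; an established session or token library would normally handle those concerns.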

Data Encryption
To implement data encryption, we’ll utilize the cryptography library to encrypt and decrypt sensitive user data. If a file is sensitive, we encrypt its data before storing it in the file upload system. When an authorized user downloads the file, the data is decrypted. In this example, we generate an encryption key using Fernet from the cryptography library. Fernet is an authenticated symmetric encryption scheme, so data that has been tampered with is rejected on decryption instead of being returned in altered form.

from cryptography.fernet import Fernet

# Generate a new symmetric key
key = Fernet.generate_key()

# Fernet instance that encrypts and decrypts with this key
f = Fernet(key)

def encrypt_file(file_path):
    """Encrypts the given file in place."""
    with open(file_path, 'rb') as file:
        file_data = file.read()
    encrypted_file_data = f.encrypt(file_data)
    with open(file_path, 'wb') as file:
        file.write(encrypted_file_data)

def decrypt_file(file_path):
    """Decrypts the given file (a path relative to ./encrypted) in place."""
    file_path = "./encrypted/" + file_path
    with open(file_path, 'rb') as file:
        encrypted_file_data = file.read()
    # decrypt() raises InvalidToken if the data was altered or encrypted with another key
    decrypted_file_data = f.decrypt(encrypted_file_data)
    with open(file_path, 'wb') as file:
        file.write(decrypted_file_data)

Secure Coding Practices
To incorporate secure coding practices, we’ll focus on input validation and SQL parameterization to prevent common vulnerabilities like SQL injection.

from flask import Flask, request
from werkzeug.security import generate_password_hash
import sqlite3

app = Flask(__name__)

# Example: User registration route with input validation and SQL parameterization
@app.route('/register', methods=['POST'])
def register():
    username = request.form.get('username', '').strip()
    password = request.form.get('password', '')
    # Input validation: reject missing or empty values
    if not username or not password:
        return 'Invalid username or password', 400
    # Assumes database.db already contains a users table
    conn = sqlite3.connect('database.db')
    try:
        # SQL statement with parameterization to prevent SQL injection;
        # the password is stored as a salted hash, never in plain text
        conn.execute(
            "INSERT INTO users (username, password) VALUES (?, ?)",
            (username, generate_password_hash(password)),
        )
        conn.commit()
        return 'User registered successfully'
    except sqlite3.Error:
        # Do not echo internal error details back to the client
        return 'An error occurred', 500
    finally:
        conn.close()

if __name__ == '__main__':
    app.run(debug=True)  # debug mode is for local development only

In this example, we validate the username and password inputs to ensure they are not empty and respond with HTTP status 400 if they are. The SQL statement for inserting user data uses parameterization to prevent SQL injection attacks, and the password is hashed with generate_password_hash from Werkzeug (installed together with Flask) before it is stored.
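
To see why parameterization matters, the following self-contained sketch runs the same lookup twice against an in-memory SQLite database, once with string formatting and once with a placeholder. The table contents and the malicious input are made up for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username TEXT, password TEXT)")
conn.execute("INSERT INTO users VALUES (?, ?)", ("alice", "hash-a"))
conn.execute("INSERT INTO users VALUES (?, ?)", ("bob", "hash-b"))

malicious = "nobody' OR '1'='1"

# Unsafe: string formatting lets the input rewrite the WHERE clause
unsafe_sql = f"SELECT username FROM users WHERE username = '{malicious}'"
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# Safe: the placeholder passes the input as a plain value
safe_rows = conn.execute(
    "SELECT username FROM users WHERE username = ?", (malicious,)
).fetchall()

print(len(unsafe_rows))  # 2 (every row matches)
print(len(safe_rows))    # 0 (no user has that literal name)
conn.close()
```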


Now that we know how to implement these DSPM practices, let’s apply them in our code. Create a file called app.py and insert the following code:

import os
import re
from flask import Flask, request, jsonify
from cryptography.fernet import Fernet

app = Flask(__name__)

# A fresh key is generated on every start, so files encrypted by an earlier
# run can no longer be decrypted. Never print or log the key.
key = Fernet.generate_key()

# Fernet instance used by encrypt_file() and decrypt_file()
f = Fernet(key)

# Example user roles and access control
user_roles = {
   'admin': ['upload_documents', 'get_document'],
   'user': ['upload_documents']
}

base_encrypted_file_path = "./encrypted"

def data_discovery():
 """Discovers all sensitive data stored in the doc storage."""
 for root, directories, files in os.walk('.'):
   for file in files:
     file_path = os.path.join(root, file)
     if is_sensitive_data(file_path):
       print(f'{file_path} is sensitive data')

def is_sensitive_data(file_path):
    """Returns True if the given file path matches a sensitive-data pattern."""
    sensitive_data_patterns = ['.*PII.*', '.*confidential.*']
    for pattern in sensitive_data_patterns:
        if re.search(pattern, file_path):
            return True
    return False

def encrypt_file(file_path):
    """Encrypts the given file in place."""
    with open(file_path, 'rb') as file:
        file_data = file.read()
    encrypted_file_data = f.encrypt(file_data)
    with open(file_path, 'wb') as file:
        file.write(encrypted_file_data)

def decrypt_file(file_path):
    """Decrypts the given file (a path relative to ./encrypted) in place."""
    file_path = "./encrypted/" + file_path
    with open(file_path, 'rb') as file:
        encrypted_file_data = file.read()
    # decrypt() raises InvalidToken if the data was altered or encrypted with another key
    decrypted_file_data = f.decrypt(encrypted_file_data)
    with open(file_path, 'wb') as file:
        file.write(decrypted_file_data)

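@app.route('/upload', methods=['POST'])
def upload_file():
    """Uploads a file to the doc storage."""
    user_role = request.headers.get('User-Role')
    if 'upload_documents' not in user_roles.get(user_role, []):
        return jsonify({'error': 'Access Denied'}), 403
    payload = request.get_json(silent=True)
    if not isinstance(payload, dict):
        return jsonify({'error': 'Expected a JSON object'}), 400
    # Keep only the final path component so a name cannot escape the storage folder
    file_name = os.path.basename(str(payload.get("name", "")))
    file_content = payload.get("data")
    if not file_name or not isinstance(file_content, str):
        return jsonify({'error': 'Invalid file name or data'}), 400
    os.makedirs(base_encrypted_file_path, exist_ok=True)
    stored_path = os.path.join(base_encrypted_file_path, file_name)
    # Store every upload; only files flagged as sensitive are encrypted at rest
    with open(stored_path, "w") as file:
        file.write(file_content)
    if is_sensitive_data(file_name):
        encrypt_file(stored_path)
    return 'File uploaded successfully'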

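@app.route('/get_file', methods=['GET'])
def get_file():
    """Gets a file from the doc storage."""
    user_role = request.headers.get('User-Role')
    if 'get_document' not in user_roles.get(user_role, []):
        return jsonify({'error': 'Access Denied'}), 403
    # Keep only the final path component so a request cannot escape the storage folder
    file_name = os.path.basename(request.args.get('file_path', ''))
    stored_path = os.path.join(base_encrypted_file_path, file_name)
    if not os.path.isfile(stored_path):
        return jsonify({'error': 'File not found'}), 404
    sensitive = is_sensitive_data(file_name)
    if sensitive:
        decrypt_file(file_name)  # decrypts the stored copy in place
    try:
        with open(stored_path, 'r') as file:
            file_data = file.read()
    finally:
        if sensitive:
            encrypt_file(stored_path)  # restore the encrypted copy on disk
    return jsonify({'file_data': file_data})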

@app.route('/users', methods=['GET'])
def get_users():
 """Gets all users."""
 return jsonify({'users': ['user1', 'user2']})

@app.route('/roles', methods=['GET'])
def get_roles():
 """Gets all roles."""
 return jsonify({'roles': ['admin', 'user']})

@app.route('/permissions', methods=['GET'])
def get_permissions():
 """Gets all permissions."""
 return jsonify({'permissions': ['read', 'write', 'delete']})

@app.route('/access_control', methods=['POST'])
def set_access_control():
 """Sets the access control for a file."""
 # Placeholder: the parameters are read but not yet stored or enforced
 file_path = request.args.get('file_path')
 user = request.args.get('user')
 role = request.args.get('role')
 permission = request.args.get('permission')
 return jsonify({'success': True})

if __name__ == '__main__':
 app.run(debug=True)

The provided code demonstrates some aspects of DSPM principles. Here’s an explanation of how it covers these principles:

Data Discovery: The data_discovery() function discovers sensitive data stored in the document storage. It walks the directory tree, flags files whose paths match the predefined patterns (PII, confidential), and prints those paths. Because the check looks at file names rather than file contents, it only finds files that are named accordingly. This function supports proactive management of sensitive data.

Encryption: The code includes functions for file encryption (encrypt_file()) and decryption (decrypt_file()). These functions use the Fernet encryption scheme from the cryptography library to encrypt and decrypt file data using a generated key. Encrypting sensitive data helps protect it from unauthorized access and ensures data confidentiality.

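Access Control: Each route looks up the caller's role in the user_roles dictionary and denies the request when the required permission is missing. The set_access_control() endpoint accepts parameters like file path, user, role, and permission, but in this example it is a placeholder that does not yet store or enforce them. Proper access control ensures that only authorized individuals or roles can access sensitive data, minimizing the risk of unauthorized disclosure or modification.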
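
The discovery step above matches file names only. A content-based check could complement it, as in the following minimal sketch; the two regular expressions and the classify_text helper are hypothetical and far from a complete PII detector.

```python
import re

# Hypothetical content patterns; real detectors need many more rules
SENSITIVE_CONTENT_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def classify_text(text: str) -> list:
    """Return the labels of all patterns found in the text, sorted by name."""
    return sorted(
        label
        for label, pattern in SENSITIVE_CONTENT_PATTERNS.items()
        if pattern.search(text)
    )

print(classify_text("Contact jane@example.com, SSN 123-45-6789"))  # ['email', 'us_ssn']
print(classify_text("Meeting notes for Tuesday"))                  # []
```

A function like this could be applied to file contents at upload time to decide whether a file should be encrypted, independent of how the file is named.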

To execute the app and demonstrate role-based access control, you can follow these steps:

  • Open a terminal or command prompt and navigate to the directory where the Python file is saved.
  • Install the required dependencies (requests is used later by the client script) by running the following command:

pip install flask cryptography requests

  • Run the Python file to start the Flask application:

python app.py

The Flask application will start running locally on http://127.0.0.1:5000/. Use an HTTP client, such as Postman or cURL, to send requests to the application endpoints as different users to demonstrate the role-based access control. Here, we will use the requests library to send GET and POST requests to our APIs.

Create a file inference.py and insert the following code:

import os
import requests

def upload_file(user_role, file_path):
    """Uploads a file to the doc storage."""
    with open(file_path, 'r') as file:
        file_data = file.read()
    payload = {
        "name": file_path,
        "data": file_data
    }
    headers = {'User-Role': user_role}
    response = requests.post('http://localhost:5000/upload', json=payload, headers=headers)
    assert response.status_code == 200

def get_file(user_role, file_path):
    """Gets a file from the doc storage and saves it under ./downloads."""
    headers = {'User-Role': user_role}
    response = requests.get('http://localhost:5000/get_file',
                            params={'file_path': file_path}, headers=headers)
    if response.status_code == 200:
        os.makedirs("./downloads", exist_ok=True)
        download_path = os.path.join("./downloads", os.path.basename(file_path))
        with open(download_path, 'w') as file:
            file.write(response.json()['file_data'])
    else:
        # For example, 403 when the role lacks the get_document permission
        print(f'Could not get {file_path}: HTTP {response.status_code}')

def set_access_control(file_path):
    """Sets the access control for a file."""
    user = 'user1'
    role = 'admin'
    permission = 'read'
    response = requests.post('http://localhost:5000/access_control', params={'file_path': file_path, 'user': user, 'role': role, 'permission': permission})
    assert response.status_code == 200

if __name__ == '__main__':
 upload_file(user_role="user", file_path="my_file.txt")
 upload_file(user_role="admin", file_path="my_file.txt")
 print("upload file done")
 get_file(user_role="admin", file_path="my_file.txt")
 print("get file function")

 # Perform requests with different access controls
 upload_file(user_role="admin", file_path="my_file.txt")               # Uploaded successfully for admin
 upload_file(user_role="user", file_path="my_file.txt")                # Uploaded successfully for user

 upload_file(user_role="admin", file_path="my_confidential_file.txt")  # Uploaded successfully for admin
 upload_file(user_role="user", file_path="my_confidential_file.txt")   # Uploaded successfully for user

 get_file(user_role="admin", file_path="my_file.txt")                  # Access granted for admin
 get_file(user_role="admin", file_path="my_confidential_file.txt")     # Access granted for admin

 get_file(user_role="user", file_path="my_confidential_file.txt")      # Access denied: the user role lacks get_document

In this file, we try different requests for sensitive and non-sensitive files, with varying user roles. Data (be it sensitive or non-sensitive) can be uploaded by both the user and admin. Whether the data is sensitive or not is determined in the upload functionality.

To download data through the get_file endpoint, the caller needs the get_document permission, which the user_roles dictionary grants only to the admin role. As written, only the admin can download files, whether confidential or not, and a request from the user role is denied. These access controls can be updated based on requirements.

To run this file, type the following command in your terminal:

python inference.py

Once you run the app, this will be the output for app.py:
[Screenshot: image1.png]

Run the inference file:
[Screenshot: image2.png]

This is the app.py output:
[Screenshot: image3.png]

The above output will help you understand the flow of the code. All the code for the above application is available on GitHub.

Conclusion

Data integrity is crucial for organizations as it ensures the accuracy, consistency, and reliability of data throughout its lifecycle. By implementing effective DSPM practices, organizations can protect data integrity and enhance their overall data security posture. We learnt what data integrity is and why it is important. Thereafter, we discussed how DSPM is used to ensure data integrity and built an online file storage system implementing some of these DSPM techniques. I hope you found this article helpful and instructional.

Discover and read more posts from Kruti Chapaneri