<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:cc="http://cyber.law.harvard.edu/rss/creativeCommonsRssModule.html">
    <channel>
        <title><![CDATA[Stories by Christopher Jones on Medium]]></title>
        <description><![CDATA[Stories by Christopher Jones on Medium]]></description>
        <link>https://medium.com/@cjones-oracle?source=rss-adc937c3a9d------2</link>
        <image>
            <url>https://cdn-images-1.medium.com/fit/c/150/150/1*p1cnWMLZICkK7MslxxG8Ow.jpeg</url>
            <title>Stories by Christopher Jones on Medium</title>
            <link>https://medium.com/@cjones-oracle?source=rss-adc937c3a9d------2</link>
        </image>
        <generator>Medium</generator>
        <lastBuildDate>Thu, 04 Jun 2026 08:13:20 GMT</lastBuildDate>
        <atom:link href="https://medium.com/@cjones-oracle/feed" rel="self" type="application/rss+xml"/>
        <webMaster><![CDATA[yourfriends@medium.com]]></webMaster>
        <atom:link href="http://medium.superfeedr.com" rel="hub"/>
        <item>
            <title><![CDATA[New Python Jupyter Notebooks for python-oracledb]]></title>
            <link>https://cjones-oracle.medium.com/new-python-jupyter-notebooks-for-python-oracledb-e813c60bbf50?source=rss-adc937c3a9d------2</link>
            <guid isPermaLink="false">https://medium.com/p/e813c60bbf50</guid>
            <category><![CDATA[python-oracledb]]></category>
            <category><![CDATA[python]]></category>
            <category><![CDATA[learning]]></category>
            <category><![CDATA[oracle-database]]></category>
            <category><![CDATA[notebook]]></category>
            <dc:creator><![CDATA[Christopher Jones]]></dc:creator>
            <pubDate>Mon, 20 Oct 2025 03:04:58 GMT</pubDate>
            <atom:updated>2025-10-22T04:20:04.315Z</atom:updated>
            <content:encoded><![CDATA[<h4>The Jupyter notebooks for python-oracledb have been updated with the latest and greatest functionality to teach you how to use Python to connect to Oracle Database.</h4><p><strong>The notebooks are at </strong><a href="https://cjones-oracle.medium.com/python-jupyter-notebooks-for-python-oracledb-14f4f7958b58"><strong>github.com/oracle/python-oracledb/tree/main/samples/notebooks</strong></a></p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*6K0RAtcOOEAcKDe8d-KxKQ.png" /></figure><h3>Python-oracledb Resources</h3><p>Python-oracledb is an open source package for the Python Database API specification with many additions to support advanced Oracle Database features. By default, it is a ‘Thin’ driver that is immediately usable without needing any additional install e.g. no Instant Client is required. Python-oracledb is used by frameworks, ORMs, SQL generation libraries, and other projects.</p><ul><li><a href="https://python-oracledb.readthedocs.io/en/latest/user_guide/installation.html">Installation instructions</a></li><li><a href="https://python-oracledb.readthedocs.io/en/latest/index.html">Documentation</a></li><li><a href="https://github.com/oracle/python-oracledb/discussions">Discussions</a></li><li><a href="https://github.com/oracle/python-oracledb">Source code on GitHub</a></li><li><a href="https://pypi.org/project/oracledb/">PyPI</a></li></ul><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=e813c60bbf50" width="1" height="1" alt="">]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[The python-oracledb driver and Oracle AI Database 26ai]]></title>
            <link>https://medium.com/oracledevs/the-python-oracledb-driver-and-oracle-ai-database-26ai-1efd6df33565?source=rss-adc937c3a9d------2</link>
            <guid isPermaLink="false">https://medium.com/p/1efd6df33565</guid>
            <category><![CDATA[features]]></category>
            <category><![CDATA[database]]></category>
            <category><![CDATA[artificial-intelligence]]></category>
            <category><![CDATA[python-oracledb]]></category>
            <category><![CDATA[python]]></category>
            <dc:creator><![CDATA[Christopher Jones]]></dc:creator>
            <pubDate>Wed, 15 Oct 2025 11:01:39 GMT</pubDate>
            <atom:updated>2025-10-16T15:09:33.340Z</atom:updated>
            <content:encoded><![CDATA[<h4><a href="https://blogs.oracle.com/database/post/oracle-announces-oracle-ai-database-26ai">Oracle AI Database 26ai has been announced</a><em>: “Oracle AI Database 26ai replaces Oracle Database 23ai. Transitioning from 23ai to 26ai is simple — just apply the October 2025 release update with no database upgrade or application re-certification. Advanced AI features like AI Vector Search are included at no additional charge.”</em></h4><figure><img alt="Oracle’s traditional cylindrical database icon (mimicing a stack of disk platters and also Oracle’s headquarters building design) with the wording “26ai” overlaid." src="https://cdn-images-1.medium.com/max/404/1*O8-6bUPb5-2v1cnpT4y8Aw.jpeg" /></figure><p>The database version is now of the form 23.26.x.y.z where “x” becomes the release update number. Since the Oracle Client libraries are a subset of the database libraries, users of Oracle’s Python driver <a href="https://pypi.org/project/oracledb/">python-oracledb</a> (when running in “Thick mode”) will see a similar, updated Client library version. The Python code:</p><pre>import oracledb<br><br>oracledb.init_oracle_client()<br>print(f&#39;Oracle Client version {oracledb.clientversion()}&#39;)<br>with oracledb.connect(user=&#39;cj&#39;, password=&#39;cj&#39;, dsn=&#39;localhost/orclpdb1&#39;) as connection:<br>    print(f&#39;Oracle AI Database version {connection.version}&#39;)</pre><p>will print:</p><pre>$ python v.py<br>Oracle Client version (23, 26, 0, 0, 0)<br>Oracle AI Database version 23.26.0.0.0</pre><p>Oracle <a href="https://www.oracle.com/database/technologies/instant-client.html">Instant Client</a> will unzip to the directory instantclient_23_26.</p><p>To read more about the versioning scheme and updating to 26ai, check out Mike Dietrich’s blog post <a href="https://mikedietrichde.com/2025/10/14/oracle-ai-database-26ai-replaces-oracle-database-23ai/">Oracle AI Database 26ai replaces Oracle Database 23ai</a></p><h3>What do you get in Python?</h3><p>It’s a great time to review some of the fantastic Oracle AI Database 26ai and related Oracle Cloud Infrastructure features that are supported by <a href="https://pypi.org/project/oracledb/">python-oracledb</a>. These include:</p><ul><li><a href="https://download.oracle.com/ocomdocs/global/python-oracledb_Vector_Technical_Brief_Oracle_Database_23ai.pdf">AI VECTOR data type</a> and AI Vector Search for AI applications</li><li><a href="https://medium.com/oracledevs/sessionless-transactions-in-python-oracledb-speed-without-strings-b218c6ff62f2">Sessionless Transactions</a> to decouple transactions from connections</li><li><a href="https://medium.com/oracledevs/the-best-of-relational-and-json-at-the-same-time-62f9ba291c98">JSON Duality Views</a> to let developers and SQL lovers have the best of both worlds</li><li><a href="https://cjones-oracle.medium.com/pipelined-database-operations-with-python-oracledb-2-4-dd8bfbdadf0d">Pipelining</a> to improve statement throughput and keep the applications and database working without delays</li><li><a href="https://python-oracledb.readthedocs.io/en/latest/user_guide/connection_handling.html#coding-applications-to-use-drcp">Multi-pool DRCP</a> for better workload partitioning</li><li><a href="https://cjones-oracle.medium.com/implicit-connection-pooling-when-connections-overload-your-database-3fe7c59acae2">Implicit Connection pooling</a> for scaling of legacy applications</li><li><a href="https://python-oracledb.readthedocs.io/en/latest/api_manual/fetch_info.html#oracledb.FetchInfo.annotations">SQL Annotations</a> for centralized application metadata</li><li><a href="https://python-oracledb.readthedocs.io/en/latest/api_manual/fetch_info.html#oracledb.FetchInfo.domain_name">Data Use Case Domains</a> for properties and constraints on the data schema</li><li><a href="https://python-oracledb.readthedocs.io/en/latest/user_guide/authentication_methods.html#instance-principal-authentication">OCI Instance Principal authentication</a> for automatic database authentication from authorized Oracle Cloud Infrastructure compute instances</li><li><a href="https://python-oracledb.readthedocs.io/en/latest/user_guide/connection_handling.html#centralized-configuration-providers">Configuration Providers</a> for centralized credential management and application configuration</li><li><a href="https://python-oracledb.readthedocs.io/en/latest/user_guide/authentication_methods.html#oci-cloud-native-authentication-with-the-oci-tokens-plugin">Cloud Native Authentication</a> for database authentication with tokens</li><li><a href="https://medium.com/db-one/a-deep-dive-into-binary-json-formats-oson-e3190e5e9eb0">Oracle’s fast, efficient OSON internal representation of JSON</a> to accelerate your modern applications</li><li><a href="https://docs.oracle.com/en-us/iaas/autonomous-database-serverless/doc/connection-tcp-fast-open.html#GUID-1AEBAEF0-454F-445E-A4BC-40E068D6D86F">TCP Fast Open</a> for faster connection usage with Oracle Autonomous AI Database</li></ul><p>There are many other recent changes in the database world that bring benefit to all applications: faster connection protocols, lots of new SQL syntax features, lots of improved database features, updated client tooling, and a focus on developers. You can find details in the <a href="https://docs.oracle.com/en/database/oracle/oracle-database/26/nfcoa/">Oracle AI Database New Features manual</a>.</p><p>Python-oracledb itself has new, important features that complement Oracle AI Database but can also be used with older database releases:</p><ul><li><a href="https://medium.com/oracledevs/using-python-for-data-analysis-and-ai-3e07e599e27b">DataFrame support</a> for fast analysis and AI use with your favorite Python modules</li><li><a href="https://medium.com/oracledevs/direct-path-loads-fast-data-ingestion-with-python-and-oracle-database-c681fb60384f">Direct Path Loads</a> for fast data ingestion and ETL workloads</li></ul><p>Overall, Oracle AI Database 26ai and the python-oracledb driver bring great technology to your business, improving efficiency and powering your work with AI agents, MCP, and vectors.</p><h3>Python-oracledb Resources</h3><p>Python-oracledb is an open source package for the Python Database API specification with many additions to support advanced Oracle Database features. By default, it is a ‘Thin’ driver that is immediately usable without needing any additional install e.g. no Instant Client is required. Python-oracledb is used by frameworks, ORMs, SQL generation libraries, and other projects.</p><ul><li><a href="https://python-oracledb.readthedocs.io/en/latest/user_guide/installation.html">Installation instructions</a></li><li><a href="https://python-oracledb.readthedocs.io/en/latest/index.html">Documentation</a></li><li><a href="https://github.com/oracle/python-oracledb/discussions">Discussions</a></li><li><a href="https://github.com/oracle/python-oracledb">Source code on GitHub</a></li><li><a href="https://pypi.org/project/oracledb/">PyPI</a></li></ul><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=1efd6df33565" width="1" height="1" alt=""><hr><p><a href="https://medium.com/oracledevs/the-python-oracledb-driver-and-oracle-ai-database-26ai-1efd6df33565">The python-oracledb driver and Oracle AI Database 26ai</a> was originally published in <a href="https://medium.com/oracledevs">Oracle Developers</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[OpenTelemetry with Python and Oracle Database]]></title>
            <link>https://medium.com/oracledevs/opentelemetry-with-python-and-oracle-database-bd5f06f5a78a?source=rss-adc937c3a9d------2</link>
            <guid isPermaLink="false">https://medium.com/p/bd5f06f5a78a</guid>
            <category><![CDATA[python-oracledb]]></category>
            <category><![CDATA[opentelemetry]]></category>
            <category><![CDATA[monitoring]]></category>
            <category><![CDATA[python]]></category>
            <category><![CDATA[database]]></category>
            <dc:creator><![CDATA[Christopher Jones]]></dc:creator>
            <pubDate>Thu, 09 Oct 2025 21:14:15 GMT</pubDate>
            <atom:updated>2025-10-09T21:31:54.149Z</atom:updated>
            <content:encoded><![CDATA[<h4>Python Oracle Database applications can easily be integrated with OpenTelemetry for monitoring and troubleshooting.</h4><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*et0QOcIfOe2g3DY2gD8t0A.png" /></figure><p>With the growing demand to monitor and trace fleets of applications, an observability framework becomes important. OpenTelemetry is the most widely known, being usable across a wide range of environments and languages.</p><p>OpenTelemetry <a href="https://opentelemetry.io/docs/what-is-opentelemetry/">describes itself</a> as:</p><blockquote><em>An observability framework and toolkit designed to facilitate the</em></blockquote><blockquote>- Generation</blockquote><blockquote>- Export</blockquote><blockquote>- Collection</blockquote><blockquote><em>of telemetry data such as traces, metrics, and logs.</em></blockquote><p>It is open source, and vendor- and tool-agnostic.</p><p>Python modules that enable sophisticated OpenTelemetry support are widely used. In particular, the OpenTelemetry Database API Instrumentation package opentelemetry-instrumentation-dbapi (documentation <a href="https://opentelemetry-python-contrib.readthedocs.io/en/latest/instrumentation/dbapi/dbapi.html#module-opentelemetry.instrumentation.dbapi">here</a>) provides observability integration for Python DB API V2 compliant drivers. This makes tracing of Oracle’s open source <a href="https://pypi.org/project/oracledb/">python-oracledb driver</a> very easy, giving you rapid, ongoing insight into your applications.</p><h4>Basic Tracing</h4><p>Start by installing required packages:</p><pre>python -m pip install opentelemetry-sdk opentelemetry-api opentelemetry-instrumentation-dbapi oracledb</pre><p>The simplest tracing available is to display logging output to your terminal. This is done by adding a ConsoleSpanExporter() and trace integration with the oracledb module to the top of your script. For example:</p><pre># opentelemetry-db-1.py<br><br>import os<br><br>import oracledb<br><br>from opentelemetry import trace<br>from opentelemetry.sdk.trace import TracerProvider<br>from opentelemetry.sdk.trace.export import (<br>    BatchSpanProcessor,<br>    ConsoleSpanExporter,<br>)<br>from opentelemetry.instrumentation.dbapi import trace_integration<br><br>provider = TracerProvider()<br>processor = BatchSpanProcessor(ConsoleSpanExporter())<br>provider.add_span_processor(processor)<br>trace.set_tracer_provider(provider)<br><br>trace_integration(oracledb, &quot;connect&quot;, &quot;oracle&quot;)<br><br># Database credentials and connection string<br>un = os.environ.get(&quot;PYTHON_USERNAME&quot;)<br>pw = os.environ.get(&quot;PYTHON_PASSWORD&quot;)<br>cs = os.environ.get(&quot;PYTHON_CONNECTSTRING&quot;)<br><br>with oracledb.connect(user=un, password=pw, dsn=cs) as connection:<br>    with connection.cursor() as cursor:<br>        sql = &quot;select &#39;hello&#39; from dual&quot;<br>        for r, in cursor.execute(sql):<br>            print(r)</pre><p>You can see that the connection and query code doesn’t have additional hooks or tracing annotations. Tracing is automatically performed.</p><p>Set your credential environment variables and run the script:</p><pre>export PYTHON_USERNAME=cj<br>export PYTHON_PASSWORD=...<br>export PYTHON_CONNECTSTRING=localhost/freepdb1<br><br>python opentelemetry-db-1.py</pre><p>The output shows the expected query result “hello” and an OpenTelemetry span named “select” which records information about the query execution:</p><pre>hello<br>{<br>    &quot;name&quot;: &quot;select&quot;,<br>    &quot;context&quot;: {<br>        &quot;trace_id&quot;: &quot;0x02aa04b49d900dcadaf577fb9e5f15a3&quot;,<br>        &quot;span_id&quot;: &quot;0x0ca4c709bf0ac7d7&quot;,<br>        &quot;trace_state&quot;: &quot;[]&quot;<br>    },<br>    &quot;kind&quot;: &quot;SpanKind.CLIENT&quot;,<br>    &quot;parent_id&quot;: null,<br>    &quot;start_time&quot;: &quot;2025-10-05T22:42:33.460805Z&quot;,<br>    &quot;end_time&quot;: &quot;2025-10-05T22:42:33.881186Z&quot;,<br>    &quot;status&quot;: {<br>        &quot;status_code&quot;: &quot;UNSET&quot;<br>    },<br>    &quot;attributes&quot;: {<br>        &quot;db.system&quot;: &quot;oracle&quot;,<br>        &quot;db.name&quot;: &quot;&quot;,<br>        &quot;db.statement&quot;: &quot;select &#39;hello&#39; from dual&quot;<br>    },<br>    &quot;events&quot;: [],<br>    &quot;links&quot;: [],<br>    &quot;resource&quot;: {<br>        &quot;attributes&quot;: {<br>            &quot;telemetry.sdk.language&quot;: &quot;python&quot;,<br>            &quot;telemetry.sdk.name&quot;: &quot;opentelemetry&quot;,<br>            &quot;telemetry.sdk.version&quot;: &quot;1.27.0&quot;,<br>            &quot;service.name&quot;: &quot;unknown_service&quot;<br>        },<br>        &quot;schema_url&quot;: &quot;&quot;<br>    }<br>}</pre><p>The trace contains metadata that is useful for understanding when the event occurred and what was happening. The start and end times can be used to calculate how long the query took. The trace also has identifier information that helps correlating nested spans in bigger applications.</p><p>OpenTelemetry Database API Instrumentation also lets you automatically track unexpected events. Edit opentelemetry-db-1.py and force an error (but don&#39;t show the message in application output) by adding this:</p><pre>        try:<br>            sql = &quot;insert into doesnotexist values (1)&quot;<br>            cursor.execute(sql)<br>        except:<br>            print(&quot;Whoops&quot;)</pre><p>The output will now contain a second OpenTelemetry span, this time named by default as “insert”. Even though the script itself “swallowed” the error and just displays “Whoops” to the console as normal output , the OpenTelemetry span contains the real error information:</p><pre>hello<br>Whoops<br>{<br>    &quot;name&quot;: &quot;select&quot;,<br>    . . .<br>}<br>{<br>    &quot;name&quot;: &quot;insert&quot;,<br><br>    . . .<br><br>    &quot;status&quot;: {<br>        &quot;status_code&quot;: &quot;ERROR&quot;,<br>        &quot;description&quot;: &quot;DatabaseError: ORA-00942: table or view \&quot;CJ\&quot;.\&quot;DOESNOTEXIST\&quot; does not exist\nHelp: https://docs.oracle.com/error-help/db/ora-00942/&quot;<br>    },<br><br>    . . .<br>}</pre><p>There will also be a stack trace in the span, which you can see when you run it yourself.</p><p>This error information capture allows you to remotely monitor and troubleshoot unexpected events without even having to explicitly log them.</p><h4>Customizing Tracing</h4><p>There are various attributes that can be set to enhance your traces. Guidelines are at <a href="https://opentelemetry-python-contrib.readthedocs.io/en/latest/instrumentation/dbapi/dbapi.html">OpenTelemetry Database API Instrumentation</a> and <a href="https://opentelemetry.io/docs/specs/semconv/database/database-spans/">Semantic conventions for database client spans</a>.</p><p>The following example sets some attributes to customize the output of a query from the <a href="https://github.com/cjbj/db-sample-schemas/tree/master/human_resources">sample </a><a href="https://github.com/cjbj/db-sample-schemas/tree/master/human_resources">LOCATIONS table</a>:</p><pre># opentelemetry-db-2.py<br><br>import os<br><br>import oracledb<br><br>from opentelemetry import trace<br>from opentelemetry.sdk.trace import TracerProvider<br>from opentelemetry.sdk.trace.export import (<br>    BatchSpanProcessor,<br>    ConsoleSpanExporter,<br>)<br>from opentelemetry.sdk.resources import Resource<br>from opentelemetry.instrumentation.dbapi import trace_integration<br><br>un = os.environ.get(&#39;ORACLE_USERNAME&#39;)<br>pw = os.environ.get(&#39;ORACLE_PASSWORD&#39;)<br>cs = os.environ.get(&#39;ORACLE_DSN&#39;)<br><br>resource = Resource(attributes={<br><br>    # displayed as attributes.service.name<br>    &quot;service.name&quot;: &quot;flightBooking&quot;,<br><br>    # displayed as resource.attributes.db.name<br>    &quot;db.name&quot;: &quot;myOracleDB&quot;<br>})<br><br>provider = TracerProvider(resource=resource)<br>processor = BatchSpanProcessor(ConsoleSpanExporter())<br>provider.add_span_processor(processor)<br>trace.set_tracer_provider(provider)<br><br>trace_integration(<br>    oracledb,<br><br>    connect_method_name=&quot;connect&quot;,<br><br>    # displayed as attributes.db.system<br>    database_system=&quot;Oracle ADB&quot;,<br><br>    # displayed as attributes.db.statement.parameters<br>    # !! SECURITY WARNING: shows bind variable values !!<br>    capture_parameters=True,<br>)<br><br>with oracledb.connect(user=un, password=pw, dsn=cs) as connection:<br>    with connection.cursor() as cursor:<br>        sql = &quot;select city from locations where location_id = :1&quot;<br>        for r, in cursor.execute(sql, [2200]):<br>            print(r)</pre><p>Output is like:</p><pre>Sydney<br>{<br>    &quot;name&quot;: &quot;select&quot;,<br><br>    . . .<br><br>    &quot;attributes&quot;: {<br>        &quot;db.system&quot;: &quot;Oracle ADB&quot;,<br>        &quot;db.name&quot;: &quot;&quot;,<br>        &quot;db.statement&quot;: &quot;select city from locations where location_id = :1&quot;,<br>        &quot;db.statement.parameters&quot;: &quot;[2200]&quot;<br>    },<br>    &quot;events&quot;: [],<br>    &quot;links&quot;: [],<br>    &quot;resource&quot;: {<br>        &quot;attributes&quot;: {<br>            &quot;service.name&quot;: &quot;flightBooking&quot;,<br>            &quot;db.name&quot;: &quot;myOracleDB&quot;<br>        },<br>        &quot;schema_url&quot;: &quot;&quot;<br>    }<br>}</pre><p>Note that setting capture_parameters caused the bind variable value &quot;2200&quot; to be recorded which is a <strong>security issue</strong>. Only use this setting in limited development circumstances, not in production applications.</p><h4>Tracing extended python-oracledb functionality</h4><p>Python-oracledb has great functionality that is not part of the Python DB API standard and therefore is not automatically instrumented by opentelemetry-instrumentation-dbapi.</p><p>To show this, edit opentelemetry-db-2.py and add a query that uses <a href="https://medium.com/oracledevs/using-python-for-data-analysis-and-ai-3e07e599e27b">python-oracledb&#39;s fantastic DataFrame</a> method fetch_df_all():</p><pre>        sql = &quot;select city from locations where country_id = :1&quot;<br>        odf = connection.fetch_df_all(sql, [&#39;UK&#39;])<br>        print(odf.num_rows())</pre><p>The script output now displays an additional 3 from the print() statement (because the <a href="https://github.com/cjbj/db-sample-schemas/tree/master/human_resources">sample table</a> I used contains three cities in the UK), but only the original OpenTelemetry span for the previous connection.execute() call is shown:</p><pre>Sydney<br>3<br>{<br>    &quot;name&quot;: &quot;select&quot;,<br><br>    . . .<br><br>    &quot;attributes&quot;: {<br>        &quot;db.system&quot;: &quot;Oracle ADB&quot;,<br>        &quot;db.name&quot;: &quot;&quot;,<br>        &quot;db.statement&quot;: &quot;select city from locations where location_id = :1&quot;,<br>        &quot;db.statement.parameters&quot;: &quot;[2200]&quot;<br>    },<br><br>    . . .<br>}</pre><p>To trace calls such as fetch_df_all()that extend the Python DB API, you can add explicit instrumentation. Change the DataFrame query code to be:</p><pre>        tracer = trace.get_tracer(__name__)<br>        with tracer.start_as_current_span(&quot;myDFQuery&quot;):<br>            sql = &quot;select city from locations where country_id = :1&quot;<br>            odf = connection.fetch_df_all(sql, [&#39;UK&#39;])<br>            print(odf.num_rows())</pre><p>The output of the script now has an extra span for the DataFrame operation:</p><pre>. . .<br>{<br>    &quot;name&quot;: &quot;myDFQuery&quot;,<br>    &quot;context&quot;: {<br>        &quot;trace_id&quot;: &quot;0x8512a9fac568c07fc16cd872f68d0346&quot;,<br>        &quot;span_id&quot;: &quot;0x03f424111825540f&quot;,<br>        &quot;trace_state&quot;: &quot;[]&quot;<br>    },<br>    &quot;kind&quot;: &quot;SpanKind.INTERNAL&quot;,<br>    &quot;parent_id&quot;: null,<br>    &quot;start_time&quot;: &quot;2025-10-06T01:20:34.200129Z&quot;,<br>    &quot;end_time&quot;:   &quot;2025-10-06T01:20:39.212618Z&quot;,<br>    &quot;status&quot;: {<br>        &quot;status_code&quot;: &quot;UNSET&quot;<br>    },<br>    &quot;attributes&quot;: {},<br>    &quot;events&quot;: [],<br>    &quot;links&quot;: [],<br>    &quot;resource&quot;: {<br>        &quot;attributes&quot;: {<br>            &quot;service.name&quot;: &quot;flightBooking&quot;,<br>            &quot;db.name&quot;: &quot;myOracleDB&quot;<br>        },<br>        &quot;schema_url&quot;: &quot;&quot;<br>    }<br>}</pre><h4>Recording and Visualizing OpenTelemetry Traces</h4><p>All the samples above wrote OpenTelemetry tracing straight to the console. However it is more useful to store traces and use them for later analysis, or for aggregation over longer time periods. There are various recording and graphical tools that can be used with OpenTelemetry, for example Zipkin, Prometheus, Grafana, and Jaeger. Here I will show <a href="https://zipkin.io/">Zipkin</a>.</p><p>The easiest way to get a Zipkin instance is in a container:</p><pre>docker run -d -p 9411:9411 --name zipkin openzipkin/zipkin</pre><p>For the application, install the additional Python exporter module for Zipkin:</p><pre>python -m pip install opentelemetry-exporter-zipkin</pre><p>Copy and edit opentelemetry-db-2b.py to add an import for ZipkinExporter. Change the BatchSpanProcessor to use it. The file diff is:</p><pre>--- opentelemetry-db-2b.py 2025-10-06 12:52:25<br>+++ opentelemetry-db-2b-zipkin.py 2025-10-06 12:53:01<br>@@ -1,14 +1,14 @@<br>-# opentelemetry-db-2b.py<br>+# opentelemetry-db-2b-zipkin.py<br> <br> import os<br> <br> import oracledb<br> <br> from opentelemetry import trace<br>+from opentelemetry.exporter.zipkin.json import ZipkinExporter<br> from opentelemetry.sdk.trace import TracerProvider<br> from opentelemetry.sdk.trace.export import (<br>     BatchSpanProcessor,<br>-    ConsoleSpanExporter,<br> )<br> from opentelemetry.sdk.resources import Resource<br> from opentelemetry.instrumentation.dbapi import trace_integration<br>@@ -27,7 +27,7 @@<br> })<br> <br> provider = TracerProvider(resource=resource)<br>-processor = BatchSpanProcessor(ConsoleSpanExporter())<br>+processor = BatchSpanProcessor(ZipkinExporter())<br> provider.add_span_processor(processor)<br> trace.set_tracer_provider(provider)</pre><p>When the docker ps command shows the Zipkin container is &quot;healthy&quot;, run your edited Python script. The result is just the application output:</p><pre>Sydney<br>3</pre><p>The tracing information has been exported to the Zipkin instance in the background.</p><p>Now you can load http://localhost:9411/zipkin/ in a browser. This gives you an overview of what happened in the app:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*ZgqD3Sj6WFuBaXUyoONpoQ.png" /></figure><p>You can drill into each span, for example by clicking SHOW for the first query:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*aV2lvRZ6Cdzk15EI-Zlcag.png" /></figure><p>The bind variable value is shown since capture_parameters=True is still set. As a reminder <strong>don’t</strong> do this in production applications.</p><h4>Summary</h4><p>OpenTelemetry is an observability framework that can be used across a wide variety of applications. Comprehensive Python packages let you rapidly enable it in python-oracledb applications for monitoring and troubleshooting. The python-oracledb driver can be instrumented automatically, or with custom tracing calls.</p><p>Various backend systems can be used record and analyze OpenTelemetry information, letting you drill into nested spans to find slow code areas, letting you aggregate data, and letting you enable custom alerts.</p><p>You may also be interested in <a href="https://medium.com/oracledevs/integrate-opentelemetry-to-build-high-performance-oracle-database-applications-with-node-js-118e3a6c8793">our work to add OpenTelemetry support to Node.js</a>.</p><h4>Python-oracledb Resources</h4><p>Python-oracledb is an open source package for the Python Database API specification with many additions to support advanced Oracle Database features. By default, it is a ‘Thin’ driver that is immediately usable without needing any additional install e.g. no Instant Client is required. Python-oracledb is used by frameworks, ORMs, SQL generation libraries, and other projects.</p><ul><li><a href="https://python-oracledb.readthedocs.io/en/latest/user_guide/installation.html">Installation instructions</a></li><li><a href="https://python-oracledb.readthedocs.io/en/latest/index.html">Documentation</a></li><li><a href="https://github.com/oracle/python-oracledb/discussions">Discussions</a></li><li><a href="https://github.com/oracle/python-oracledb">Source code on GitHub</a></li><li><a href="https://pypi.org/project/oracledb/">PyPI</a></li></ul><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=bd5f06f5a78a" width="1" height="1" alt=""><hr><p><a href="https://medium.com/oracledevs/opentelemetry-with-python-and-oracle-database-bd5f06f5a78a">OpenTelemetry with Python and Oracle Database</a> was originally published in <a href="https://medium.com/oracledevs">Oracle Developers</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Direct Path Loads: Fast data ingestion with Python and Oracle Database]]></title>
            <link>https://medium.com/oracledevs/direct-path-loads-fast-data-ingestion-with-python-and-oracle-database-c681fb60384f?source=rss-adc937c3a9d------2</link>
            <guid isPermaLink="false">https://medium.com/p/c681fb60384f</guid>
            <category><![CDATA[oracle-database]]></category>
            <category><![CDATA[database]]></category>
            <category><![CDATA[python]]></category>
            <category><![CDATA[data]]></category>
            <category><![CDATA[etl]]></category>
            <dc:creator><![CDATA[Christopher Jones]]></dc:creator>
            <pubDate>Tue, 07 Oct 2025 04:25:20 GMT</pubDate>
            <atom:updated>2025-10-07T14:04:08.563Z</atom:updated>
            <content:encoded><![CDATA[<h4>The fastest way to load very large datasets with Python into Oracle Database is python-oracledb’s <a href="https://python-oracledb.readthedocs.io/en/latest/api_manual/connection.html#oracledb.Connection.direct_path_load">direct_path_load()</a> method. This can radically improve your <a href="https://en.wikipedia.org/wiki/Extract%2C_transform%2C_load">ETL</a> workflow performance.</h4><figure><img alt="A bar graph. Each bar has a different color. The trend is for each successive bar to be shorter than the previous one. There are no numbers or captions to describe the bars. Overlaid text says “Direct Path Loads. Fast data ingestion”" src="https://cdn-images-1.medium.com/max/1024/1*pyxxN69oraQ-fqfdoeeRtQ.png" /></figure><p>This blog benchmarks various ways to load data into Oracle Database using Python.</p><h3>The Background</h3><p><a href="https://en.wikipedia.org/wiki/Extract%2C_transform%2C_load">ETL</a> (“Extract, Transform, and Load”) pipelines are common in Python, specially in the world of data analysis and AI. Performance is critical, so how can you improve data loading times?</p><p><strong>Direct Path Loading</strong> is a database feature most commonly known from <a href="https://docs.oracle.com/en/database/oracle/oracle-database/23/sutil/oracle-sql-loader.html">SQL*Loader</a>, but also exposed by JDBC and ODP.NET — and now by python-oracledb. It allows data being inserted into an Oracle Database table to bypass code layers such as the database buffer cache. Also there are no INSERT statements used. Direct Path Loads allow very fast ingestion of huge amounts of data.</p><p>Direct Path Load support was introduced in python-oracledb 3.4 with a <a href="https://python-oracledb.readthedocs.io/en/latest/api_manual/connection.html#oracledb.Connection.direct_path_load">direct_path_load()</a> method off the connection object. It lets you pass in a list of sequences or a DataFrame. For example with a list:</p><pre>DATA = [<br>    (1, &quot;First row&quot;),<br>    (2, &quot;Second row&quot;),<br>    (3, &quot;Third row&quot;),<br>]<br><br>connection.direct_path_load(<br>    schema_name=&quot;HR&quot;,<br>    table_name=&quot;mytab&quot;,<br>    column_names=[&quot;id&quot;, &quot;name&quot;]<br>    data=DATA<br>)</pre><p>The API is simple. You pass in the schema, table name, column names, and the data. There is only one additional argument available in this initial release: a batch_size parameter that is used to split the processing of the supplied data into chunks of that number of rows, allowing you to more easily tune the performance without having to explicitly loop and make multiple calls to <a href="https://python-oracledb.readthedocs.io/en/latest/api_manual/connection.html#oracledb.Connection.direct_path_load">direct_path_load()</a>. The method is supported only in python-oracledb Thin mode - there are no plans to add it to Thick mode.</p><p>The API documentation is <a href="https://python-oracledb.readthedocs.io/en/latest/api_manual/connection.html#oracledb.Connection.direct_path_load">here</a>. The User Guide is <a href="https://python-oracledb.readthedocs.io/en/latest/user_guide/batch_statement.html#direct-path-loads">here</a>.</p><p>As a consequence of the specialized Database architecture, there are a few restrictions on when Direct Path Loads can be used: check the documentation on SQL*Loader <a href="https://www.oracle.com/pls/topic/lookup?ctx=dblatest&amp;id=GUID-0D576DEF-7918-4DD2-A184-754D217C021F">Direct Path Loads</a> and on the Oracle Call Interface <a href="https://www.oracle.com/pls/topic/lookup?ctx=dblatest&amp;id=GUID-596F5F9B-47A1-48DB-8702-FEED7BE038B9">Direct Path Load Interface</a>.</p><h4>The CSV file, the database table, the benchmark</h4><p>I decided to measure not just insertion time, but the whole time to ingest a CSV file into an Oracle Database table. This is a task I know a lot of you are doing.</p><p><strong>The CSV files</strong>: I used files with 100,000 lines, with 1,000,000 lines, and with 2,000,000 lines. The CSV files were like:</p><pre>1,&quot;23-Sep-2025&quot;,&quot;String for row 1&quot;<br>2,&quot;23-Sep-2025&quot;,&quot;String for row 2&quot;<br>. . .</pre><p>The simple code I used to create the files is <a href="https://gist.github.com/cjbj/f4b605cb7db9acdf10f4ff3a2ba58d4d">here</a>.</p><p><strong>The table</strong>: The database table was created like:</p><pre>create table mytab (id number, dt date, name varchar2(50));</pre><p><strong>The code</strong>: Each test read the whole file into Python memory, and wrote it to database in one go. My code was actually structured to allow loading in smaller batches so I could experiment with streaming data. In a real world scenario you might have too much data to hold in memory, or to efficiently send across the network in one operation. But, for this blog, I kept it simple and did everything in one shot. Wherever you see BATCH_SIZE or BLOCK_SIZE in the code snippets, assume that this is set to the appropriate value to read and insert all rows at once.</p><p>The complete code can be found <a href="https://gist.github.com/cjbj/8c6a3e2646bccfcf2bd8d41f64974b1f">here</a>.</p><h4>The data ingestion choices</h4><p>There are three ways I used to read data from a CSV file into Python:</p><ul><li>With the Pandas <a href="https://pandas.pydata.org/docs/reference/api/pandas.read_csv.html">read_csv()</a> method which reads into a DataFrame</li><li>With the standard <a href="https://docs.python.org/3/library/csv.html">Python library </a><a href="https://docs.python.org/3/library/csv.html">csv module</a> which returns each row as a list of strings</li><li>With <a href="https://arrow.apache.org/docs/python/generated/pyarrow.csv.read_csv.html#pyarrow.csv.read_csv">PyArrow’s csv</a> methods which read into a PyArrow Table interoperable with a DataFrame</li></ul><p>After data was in Python memory, there were five ways to insert into the database:</p><ul><li>Calling the Pandas <a href="https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.to_sql.html">to_sql()</a> method</li><li>Passing a list to python-oracledb’s <a href="https://python-oracledb.readthedocs.io/en/latest/api_manual/cursor.html#oracledb.Cursor.executemany">executemany()</a> method</li><li>Passing a DataFrame to python-oracledb’s <a href="https://python-oracledb.readthedocs.io/en/latest/api_manual/cursor.html#oracledb.Cursor.executemany">executemany()</a> method</li><li>Passing a list to python-oracledb’s <a href="https://python-oracledb.readthedocs.io/en/latest/api_manual/connection.html#oracledb.Connection.direct_path_load">direct_path_load()</a> method</li><li>Passing a DataFrame to python-oracledb’s <a href="https://python-oracledb.readthedocs.io/en/latest/api_manual/connection.html#oracledb.Connection.direct_path_load">direct_path_load()</a> method</li></ul><p>Obviously, not all combinations are possible.</p><h3>The Solutions</h3><p>I tried five solutions, which are listed here in slowest-to-fastest order. The complete code can be found <a href="https://gist.github.com/cjbj/8c6a3e2646bccfcf2bd8d41f64974b1f">here</a>. The results are shown at the bottom.</p><h4>1. Loading CSV Data with Pandas</h4><p>Though simple to code, using Pandas to load data is the slowest solution — and not just because of extra initial overhead checking the schema. So unless you need specific Pandas data loading functionality, or performance really isn’t important, you should not use it.</p><p>Code is like:</p><pre>csv_reader = pandas.read_csv(<br>    &quot;sample.csv&quot;,<br>    header=None,<br>    names=[&quot;id&quot;, &quot;dt&quot;, &quot;name&quot;],<br>    parse_dates=[&#39;dt&#39;],<br>    chunksize=BATCH_SIZE)<br>for d in csv_reader:<br>    d.to_sql(&quot;mytab&quot;, engine, if_exists=&#39;append&#39;, index=False)</pre><p>To be roughly comparable with other solutions, I pre-created the table, so I used the append mode of Pandas.</p><p>With large data sets, the overhead Pandas incurs checking the schema before inserting data is relatively lower, but it is still a factor if ultimate speed is your goal.</p><h4>2. Using Python’s csv module to read into a list, and executemany() to insert</h4><p>Using the standard Python csv module and calling Oracle’s efficient <a href="https://python-oracledb.readthedocs.io/en/latest/api_manual/cursor.html#oracledb.Cursor.executemany">executemany()</a> method used to be the “best-practice” solution but now comes in as second slowest.</p><p>Code is like:</p><pre>cursor.setinputsizes(None, None, 50)<br><br>sql = &quot;insert into mytab (id, dt, name) values (:1, :2, :3)&quot;<br>data = []<br>csv_reader = csv.reader(open(&quot;sample.csv&quot;, &quot;r&quot;), delimiter=&quot;,&quot;)<br>for line in csv_reader:<br>    data.append((float(line[0]), datetime.strptime(line[1], &quot;%d-%b-%Y&quot;), line[2]))<br>    if len(data) % BATCH_SIZE == 0:<br>        cursor.executemany(sql, data)<br>        data = []<br>if data:<br>    cursor.executemany(sql, data)<br>connection.commit()</pre><p>I carefully used <a href="https://python-oracledb.readthedocs.io/en/latest/api_manual/cursor.html#oracledb.Cursor.setinputsizes">setinputsizes()</a> so that python-oracledb’s <a href="https://python-oracledb.readthedocs.io/en/latest/api_manual/cursor.html#oracledb.Cursor.executemany">executemany()</a> call knew how much memory to allocate for each of the three fields / bind variables and didn’t have to do slow re-allocations as more data was parsed. The first column is numeric, so I passed None to use the default type handling. The second bind variable of the call is also None since the default python-oracledb date handling knows the size of dates. The third CSV field is a string so I chose the upper column size.</p><p>The dates are stored in the CSV file as strings “23-Sep-2025”. My code converts them to datetime objects for insertion. Without this explicit conversion, performance can be slow due to the type mismatch with the database type. See my earlier blog <a href="https://medium.com/oracledevs/application-and-database-type-mismatches-slow-down-data-loads-eda6ced82702">Application and database type mismatches slow down data loads</a>.</p><h4>3. Using PyArrow’s CSV loader and passing the DataFrame to executemany()</h4><p>Using PyArrow’s CSV loader has some type handling efficiencies, so performance of this solution was another step better for me.</p><p>Code is like:</p><pre>sql = &quot;insert into mytab (id, dt, name) values (:1, :2, :3)&quot;<br><br>read_options = pyarrow.csv.ReadOptions(<br>    column_names=[&quot;id&quot;, &quot;dt&quot;, &quot;name&quot;],<br>    block_size=BLOCK_SIZE<br>)<br>convert_options = pyarrow.csv.ConvertOptions(<br>    timestamp_parsers=[&quot;%d-%b-%Y&quot;],<br>    column_types={<br>        &quot;id&quot;:pyarrow.int64(),<br>        &quot;dt&quot;:pyarrow.timestamp(&quot;us&quot;),<br>        &quot;name&quot;:pyarrow.string()<br>    }<br>)<br>csv_reader = pyarrow.csv.open_csv(<br>    &quot;sample.csv&quot;,<br>    read_options=read_options,<br>    convert_options=convert_options)<br>for df in csv_reader:<br>    if df is None:<br>        break<br>    cursor.executemany(sql, df)<br>connection.commit()</pre><p>For the read options, I set the column names since my CSV file didn’t have a header row.</p><p>When reading in batches, PyArrow takes a buffer size, not a row count. As previously noted, I set this to a size so that all the data was read and inserted in one go.</p><p>Since <a href="https://python-oracledb.readthedocs.io/en/latest/api_manual/cursor.html#oracledb.Cursor.executemany">executemany()</a> can take a DataFrame, I didn&#39;t need to do any conversion of the data to a list. This resulted in compact code iterating over the data.</p><h4>4. Using Python’s csv module to read into a list, and direct_path_load() to insert</h4><p>Now the fun begins: Using the standard Python csv module to construct a list, and then passing that to the new <a href="https://python-oracledb.readthedocs.io/en/latest/api_manual/connection.html#oracledb.Connection.direct_path_load">direct_path_load()</a> method is getting speedy.</p><p>Code is like:</p><pre>data = []<br>csv_reader = csv.reader(open(&quot;sample.csv&quot;, &quot;r&quot;), delimiter=&quot;,&quot;)<br>for line in csv_reader:<br>    data.append((float(line[0]), datetime.strptime(line[1], &quot;%d-%b-%Y&quot;), line[2]))<br>    if len(data) % BATCH_SIZE == 0:<br>        connection.direct_path_load(<br>            schema_name=&quot;HR&quot;,<br>            table_name=&quot;mytab&quot;,<br>            column_names=[&quot;id&quot;, &quot;dt&quot;, &quot;name&quot;],<br>            data=data)<br>        data = []<br>if data:<br>    connection.direct_path_load(<br>        schema_name=&quot;HR&quot;,<br>        table_name=&quot;mytab&quot;,<br>        column_names=[&quot;id&quot;, &quot;dt&quot;, &quot;name&quot;],<br>        data=data)</pre><p>The code is very similar to solution #2, but calls <a href="https://python-oracledb.readthedocs.io/en/latest/api_manual/connection.html#oracledb.Connection.direct_path_load">direct_path_load()</a> instead of <a href="https://python-oracledb.readthedocs.io/en/latest/api_manual/cursor.html#oracledb.Cursor.executemany">executemany()</a>.</p><p>There is no explicit commit() call here, since <a href="https://python-oracledb.readthedocs.io/en/latest/api_manual/connection.html#oracledb.Connection.direct_path_load">direct_path_load()</a> will commit the data.</p><h4>5. Using PyArrow’s CSV loader and passing the DataFrame to direct_path_load()</h4><p>The new best-practice solution takes advantage of both the PyArrow CSV loader to read into a DataFrame format, and then passing this DataFrame to <a href="https://python-oracledb.readthedocs.io/en/latest/api_manual/connection.html#oracledb.Connection.direct_path_load">direct_path_load()</a>.</p><p>Code is like:</p><pre>read_options = pyarrow.csv.ReadOptions(<br>    column_names=[&quot;id&quot;, &quot;dt&quot;, &quot;name&quot;]<br>    block_size=BLOCK_SIZE<br>)<br>convert_options = pyarrow.csv.ConvertOptions(<br>    timestamp_parsers=[&quot;%d-%b-%Y&quot;],<br>    column_types={<br>        &quot;id&quot;: pyarrow.int64(),<br>        &quot;dt&quot;: pyarrow.timestamp(&quot;us&quot;),<br>        &quot;name&quot;: pyarrow.string()<br>    }<br>)<br>csv_reader = pyarrow.csv.open_csv(<br>    &quot;sample.csv&quot;, read_options=read_options, convert_options=convert_options)<br>for df in csv_reader:<br>    if df is None:<br>        break<br>    connection.direct_path_load(<br>        schema_name=&quot;HR&quot;,<br>        table_name=&quot;mytab&quot;,<br>        column_names=[&quot;id&quot;, &quot;dt&quot;, &quot;name&quot;],<br>        data=df)</pre><p>The code is similar to solution #3, but calls <a href="https://python-oracledb.readthedocs.io/en/latest/api_manual/connection.html#oracledb.Connection.direct_path_load">direct_path_load()</a> instead of <a href="https://python-oracledb.readthedocs.io/en/latest/api_manual/cursor.html#oracledb.Cursor.executemany">executemany()</a>.</p><p>For the read options, I set the column names since my CSV file didn’t have a header row.</p><p>When reading in batches, PyArrow takes a buffer size, not a row count. As previously noted, I set this to the maximum size so that all the data was read and inserted in one go.</p><p>Since <a href="https://python-oracledb.readthedocs.io/en/latest/api_manual/connection.html#oracledb.Connection.direct_path_load">direct_path_load()</a> can take a DataFrame, I didn&#39;t need to do any conversion of the data to a list. This resulted in compact code iterating over the data.</p><p>There is no explicit commit() call here, since <a href="https://python-oracledb.readthedocs.io/en/latest/api_manual/connection.html#oracledb.Connection.direct_path_load">direct_path_load()</a> will commit the data.</p><h3>Results in a Picture</h3><p>Here are my results. Less is better. The numbers are averages over a few runs. In each group, the bars represent the solutions in order described above.</p><figure><img alt="A graph of 3 groups of 5 bars. Each bar is the time for one of the solutions in the blog. The 3 groups are for 100,000 rows, for 1,000,000 rows, and for 2,000,000 rows. In each group, the bars decrease is size from left to right. The left bar in each group is for solution #1. The right bar in each group is for solution #5. For the group with 2,000,000 rows the 5 bars are labelled with the numbers of seconds it took the solution to load data. The numbers are : 49.3, 35.1, 29.7, 19.1, and 11.8." src="https://cdn-images-1.medium.com/max/1024/1*OzPTsw8hi2tznAl2famJbg.png" /></figure><p>For the three file sizes, loading with Pandas (solution #1) was slowest, while loading with PyArrow’s csv module and calling <a href="https://python-oracledb.readthedocs.io/en/latest/api_manual/connection.html#oracledb.Connection.direct_path_load">direct_path_load()</a> (solution #5) was the fastest. The benefit increased as the data size increased. <strong>For my 2,000,000 row file, Direct Path Loading was 4 times faster than using Pandas, and 3 times faster than solution #2, our old recommendation.</strong></p><p>On the numbers themselves, I am using an x86_64 version 23 database, emulating the architecture on an arm64 Mac, so it is inherently slow. Also I was running Python on the same Mac. Your results will vary for all the normal reasons, so do your own benchmarking. Reasons include how slow your machine is, the network speed, how much other work your database is doing, how much data you are loading, and what data types are involved. When testing, you may also want to measure the impact on the database itself. Even if the elapsed time for using <a href="https://python-oracledb.readthedocs.io/en/latest/api_manual/connection.html#oracledb.Connection.direct_path_load">direct_path_load()</a> is slower than <a href="https://python-oracledb.readthedocs.io/en/latest/api_manual/cursor.html#oracledb.Cursor.executemany">executemany()</a> for a particular data set (e.g a small one), there may still be a reduced impact on the database, which may improve overall system performance.</p><h3>Summary</h3><p>The fastest way to load very large datasets into Oracle Database with Python is to use <a href="https://python-oracledb.readthedocs.io/en/latest/api_manual/connection.html#oracledb.Connection.direct_path_load">direct_path_load()</a>. When loading a CSV file, you can also take advantage of PyArrow CSV methods to read into a DataFrame format. Direct Path Loading performance will vary with many factors, including the data types involved. Direct Path Loads are fast but have a few database restrictions, so review the database documentation before assuming you can magically drop them into an existing <a href="https://en.wikipedia.org/wiki/Extract%2C_transform%2C_load">ETL</a> pipeline. Unlike <a href="https://python-oracledb.readthedocs.io/en/latest/api_manual/cursor.html#oracledb.Cursor.executemany">executemany()</a>, there is currently no way to filter noisy data, so make sure your data is clean before trying to load it.</p><p>When evaluating which Oracle Database data loading solution to use, don’t forget to check out Oracle’s specialized <a href="https://docs.oracle.com/en/database/oracle/oracle-database/23/sutil/oracle-sql-loader.html">SQL*Loader</a> tool, and also <a href="https://docs.oracle.com/en/database/oracle/oracle-database/23/sutil/oracle-external-tables.html">External Tables</a>.</p><p>Let us know what you think of Direct Path Loading. If you want to try (or improve) the five solutions shown in this blog, the code is <a href="https://gist.github.com/cjbj/8c6a3e2646bccfcf2bd8d41f64974b1f">here</a>.</p><p>Thanks for using python-oracledb.</p><h3>Python-oracledb Resources</h3><p>Python-oracledb is an open source package for the Python Database API specification with many additions to support advanced Oracle Database features. By default, it is a ‘Thin’ driver that is immediately usable without needing any additional install e.g. no Instant Client is required. Python-oracledb is used by frameworks, ORMs, SQL generation libraries, and other projects.</p><ul><li><a href="https://python-oracledb.readthedocs.io/en/latest/user_guide/installation.html">Installation instructions</a></li><li><a href="https://python-oracledb.readthedocs.io/en/latest/index.html">Documentation</a></li><li><a href="https://github.com/oracle/python-oracledb/discussions">Discussions</a></li><li><a href="https://github.com/oracle/python-oracledb">Source code on GitHub</a></li><li><a href="https://pypi.org/project/oracledb/">PyPI</a></li></ul><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=c681fb60384f" width="1" height="1" alt=""><hr><p><a href="https://medium.com/oracledevs/direct-path-loads-fast-data-ingestion-with-python-and-oracle-database-c681fb60384f">Direct Path Loads: Fast data ingestion with Python and Oracle Database</a> was originally published in <a href="https://medium.com/oracledevs">Oracle Developers</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[python-oracledb 3.4 introduces Direct Path Loads for rapid bulk data insertion]]></title>
            <link>https://medium.com/oracledevs/python-oracledb-3-4-introduces-direct-path-loads-for-rapid-bulk-data-insertion-da38990d799a?source=rss-adc937c3a9d------2</link>
            <guid isPermaLink="false">https://medium.com/p/da38990d799a</guid>
            <category><![CDATA[etl]]></category>
            <category><![CDATA[database]]></category>
            <category><![CDATA[releases]]></category>
            <category><![CDATA[oracle-database]]></category>
            <category><![CDATA[python]]></category>
            <dc:creator><![CDATA[Christopher Jones]]></dc:creator>
            <pubDate>Tue, 07 Oct 2025 04:21:12 GMT</pubDate>
            <atom:updated>2026-03-12T22:55:29.004Z</atom:updated>
            <content:encoded><![CDATA[<h4>Another great release of python-oracledb supports your use of Python and Oracle Database for <a href="https://en.wikipedia.org/wiki/Extract%2C_transform%2C_load">ETL</a>, Data Analysis, and AI.</h4><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*5NIsjCsgAmzoyl8E" /><figcaption>Photo by <a href="https://unsplash.com/@chrispanas?utm_source=medium&amp;utm_medium=referral">chris panas</a> on <a href="https://unsplash.com?utm_source=medium&amp;utm_medium=referral">Unsplash</a></figcaption></figure><p>The python-oracledb 3.4 driver is now available on <a href="https://pypi.org/project/oracledb/">PyPI</a>.</p><h4><strong>Direct Path Loading</strong></h4><p>The marquee feature in <a href="https://pypi.org/project/oracledb/">python-oracledb</a> 3.4 is <a href="https://python-oracledb.readthedocs.io/en/latest/user_guide/batch_statement.html#direct-path-loads">Direct Path Loading</a>, something I’ve really wanted to support for a long time.</p><p>Direct Path Loading is a database feature most commonly known from SQL*Loader, but also exposed by JDBC, Oracle Call Interface, and ODP.NET — and now by python-oracledb. It allows data being inserted into an Oracle Database table to bypass code layers such as the database buffer cache. Also there are no INSERT statements used. This allows very fast ingestion of huge amounts of data.</p><p>In python-oracledb 3.4, a simple API <a href="https://python-oracledb.readthedocs.io/en/latest/api_manual/connection.html#oracledb.Connection.direct_path_load">direct_path_load()</a> lets you pass in a list of sequences or a DataFrame to be loaded into a table. For example with data in a sequence:</p><pre>DATA = [<br>    (1, &quot;First row&quot;),<br>    (2, &quot;Second row&quot;),<br>    (3, &quot;Third row&quot;),<br>]<br><br>connection.direct_path_load(<br>    schema_name=&quot;HR&quot;,<br>    table_name=&quot;mytab&quot;,<br>    column_names=[&quot;id&quot;, &quot;name&quot;]<br>    data=DATA<br>)</pre><p>If you are using Python to load large amounts of data into Oracle Database you will want to check out this new feature. See my companion blog <a href="https://cjones-oracle.medium.com/direct-path-loads-fast-data-ingestion-with-python-and-oracle-database-c681fb60384f">Direct Path Loads: Fast data ingestion with Python and Oracle Database</a> for more information.</p><p>Python-oracledb documentation on Direct Path Loads is <a href="https://python-oracledb.readthedocs.io/en/latest/user_guide/batch_statement.html#direct-path-loads">here</a>.</p><h4><strong>DataFrames are “production”</strong></h4><p>We have removed the “pre-release” status for DataFrame features. My thanks to all our users and early adopters for their inputs into the design and functionality over the last few releases.</p><h4><strong>DataFrame Type Mapping</strong></h4><p>We now allow explicit “type mapping” when querying into a DataFrame, letting you choose the types that your data frames will use. The following will create a Dataframe containing an int16 and a string, with names “col_1” and “col_2” respectively.</p><pre>schema = pyarrow.schema([<br>    (&quot;col_1&quot;, pyarrow.int16()),<br>    (&quot;col_2&quot;, pyarrow.string())<br>])<br><br>odf = connection.fetch_df_all(<br>    &quot;select 456 c1, &#39;King&#39; c2 from dual&quot;,<br>    requested_schema=schema<br>)<br>tab = pyarrow.table(odf)<br>print(tab)</pre><p>One use for this is to reduce the memory requirements for numbers which have a known, small value range, since you can now specify a smaller numeric type than the default type mapping uses. This can also help performance.</p><p>See the documentation <a href="https://python-oracledb.readthedocs.io/en/latest/user_guide/dataframes.html#explicit-data-frame-type-mapping">Explicit Data Frame Type Mapping</a> for more information.</p><h4>DataFrame Chunk Support</h4><p>We now support multiple chunks when ingesting DataFrames.</p><h4>DataFrame Type Support</h4><p>Additional data types are now supported in DataFrames. We added support for all of the signed and unsigned fixed width integer types when ingesting data frames supporting the Arrow PyCapsule interface into Oracle Database. Previously only int64 was supported. Also added was support for types date32 and date64.</p><p>When querying string and binary data into a DataFrame, we now default to the Apache Arrow “LARGE_STRING” and “LARGE_BINARY” types. These support 64-bit offsets, making it more convenient to work with large data sets. If saving 4 bytes per record is important, you can use explicit type mapping to request STRING or BINARY types.</p><h4><strong>Easy Batch Size Control with </strong><a href="https://python-oracledb.readthedocs.io/en/latest/api_manual/cursor.html#oracledb.Cursor.executemany"><strong>executemany()</strong></a></h4><p>The <a href="https://python-oracledb.readthedocs.io/en/latest/api_manual/cursor.html#oracledb.Cursor.executemany">executemany()</a> method commonly used for batch INSERT and UPDATE statements now accepts a batch_size parameter. This is used to split the processing of the supplied data into chunks of that number of rows. This allows you to more easily tune the performance of batch inserts without having to explicitly loop and make multiple calls to <a href="https://python-oracledb.readthedocs.io/en/latest/api_manual/cursor.html#oracledb.Cursor.executemany">executemany()</a>.</p><h4><strong>Fine-grained control over LOB and number handling</strong></h4><p>We added fetch_lobs and fetch_decimals parameters where applicable to the methods used for fetching rows or DataFrames. These behave in the same way as the <a href="https://python-oracledb.readthedocs.io/en/latest/api_manual/defaults.html#oracledb.Defaults.fetch_lobs">oracledb.defaults.fetch_lobs</a> and <a href="https://python-oracledb.readthedocs.io/en/latest/api_manual/defaults.html#oracledb.Defaults.fetch_decimals">oracledb.defaults.fetch_decimals</a> attributes, but give you more control over which routines in your code return data in the desired format.</p><h4><strong>New Optional Install Dependencies</strong></h4><p>For users of <a href="https://python-oracledb.readthedocs.io/en/latest/user_guide/connection_handling.html#configurationproviders">Centralized Configuration Provider</a> support and <a href="https://python-oracledb.readthedocs.io/en/latest/user_guide/connection_handling.html#cloudnativeauthoci">Cloud Native Authentication</a>, we added optional install dependencies [oci_config], [azure_config], [oci_auth] and [azure_auth] to simplify installation of required dependencies. For example, to get set up to use OCI Cloud Native Authentication, you can now do:</p><pre>python -m pip install oracledb[oci_auth]</pre><h4><strong>Deprecation Warnings</strong></h4><p>We’re giving notice that we will necessarily need to de-support old stuff at some future time.</p><p>Eventually we will have to stop building packages for macOS Intel and Windows 32-bit because the Python <a href="https://pypi.org/project/cryptography/">cryptography</a> package we require has just <a href="https://mail.python.org/archives/list/python-announce-list@python.org/thread/R4BZNC36MSFLKULA74KILLFY6GP3VCPA/">announced its deprecation</a> of these architectures. Their timeline puts an upper limit on how long we can continue to produce packages for these architectures. We may need to drop support a bit earlier, depending on how the python-oracledb major release schedule aligns with the cryptography package’s.</p><p>We’re also giving notice that as time passes and new versions of Oracle Database and Oracle Client are released, we will eventually need to take very old versions out of our test plan and connectivity / interoperability support. Over the last 18 years we’ve been adding support for new versions as they have come out: Oracle Database 11g features were added to cx_Oracle/python-oracledb back in October 2007. Oracle Database version 12c support was added in May 2014. Those are ancient dates. These older database and client software packages have themselves been in <a href="https://support.oracle.com/epmos/faces/DocumentDisplay?id=742060.1">“Upgrade Support”</a>-only status from Oracle Support for a number of years. Now that Oracle Database is at version 23, we’re nudging you to upgrade. We haven’t set any specific timeline for python-oracledb’s desupport of the old versions, but be warned it will have to happen.</p><h4>Other changes</h4><p>For all the other improvements and bug fixes in python-oracledb 3.4, see the <a href="https://python-oracledb.readthedocs.io/en/latest/release_notes.html">Release Notes</a>.</p><p>Thank you for using python-oracledb !</p><h3>Python-oracledb Resources</h3><p>Python-oracledb is an open source package for the Python Database API specification with many additions to support advanced Oracle Database features. By default, it is a ‘Thin’ driver that is immediately usable without needing any additional install e.g. no Instant Client is required. Python-oracledb is used by frameworks, ORMs, SQL generation libraries, and other projects.</p><ul><li><a href="https://python-oracledb.readthedocs.io/en/latest/user_guide/installation.html">Installation instructions</a></li><li><a href="https://python-oracledb.readthedocs.io/en/latest/index.html">Documentation</a></li><li><a href="https://github.com/oracle/python-oracledb/discussions">Discussions</a></li><li><a href="https://github.com/oracle/python-oracledb">Source code on GitHub</a></li><li><a href="https://pypi.org/project/oracledb/">PyPI</a></li></ul><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=da38990d799a" width="1" height="1" alt=""><hr><p><a href="https://medium.com/oracledevs/python-oracledb-3-4-introduces-direct-path-loads-for-rapid-bulk-data-insertion-da38990d799a">python-oracledb 3.4 introduces Direct Path Loads for rapid bulk data insertion</a> was originally published in <a href="https://medium.com/oracledevs">Oracle Developers</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Python Data Classes make it easy to fetch database rows as objects]]></title>
            <link>https://medium.com/oracledevs/python-data-classes-make-it-easy-to-fetch-database-rows-as-objects-8c3ac5932d77?source=rss-adc937c3a9d------2</link>
            <guid isPermaLink="false">https://medium.com/p/8c3ac5932d77</guid>
            <category><![CDATA[python]]></category>
            <category><![CDATA[developer]]></category>
            <category><![CDATA[data-representation]]></category>
            <category><![CDATA[oracle-database]]></category>
            <category><![CDATA[database]]></category>
            <dc:creator><![CDATA[Christopher Jones]]></dc:creator>
            <pubDate>Mon, 22 Sep 2025 22:49:50 GMT</pubDate>
            <atom:updated>2025-09-23T00:14:51.991Z</atom:updated>
            <content:encoded><![CDATA[<p>Python-oracledb <a href="https://python-oracledb.readthedocs.io/en/latest/api_manual/cursor.html#oracledb.Cursor.rowfactory">rowfactories</a> are a powerful way for Oracle Database queries to alter the representation of fetched rows, reducing the amount of application boilerplate code and data copying. This blog shows how easy it is to use a <a href="https://docs.python.org/3/library/dataclasses.html">Python Data Class</a> with a rowfactory to transform rows into instances of a user-defined class.</p><figure><img alt="A picture of two conveyer belts in an industrial building, playing on the theme of factory transformations." src="https://cdn-images-1.medium.com/max/1024/0*JSB6JPKk2yqzYV5r" /><figcaption>Photo by <a href="https://unsplash.com/@tama66?utm_source=medium&amp;utm_medium=referral">Peter Herrmann</a> on <a href="https://unsplash.com?utm_source=medium&amp;utm_medium=referral">Unsplash</a></figcaption></figure><p>A rowfactory is a method that is invoked for each row fetched from the database before it is returned to the application. It can be set on a cursor after statement execution, before data is fetched.</p><p>Consider this code which <strong>does not</strong> use a rowfactory:</p><pre>cursor.execute(<br>    &quot;&quot;&quot;select employee_id, last_name, hire_date<br>       from employees<br>       where employee_id &lt; 103<br>       order by employee_id&quot;&quot;&quot;)<br><br>for row in cursor:<br>    print(row)</pre><p>It simply prints tuples:</p><pre>(100, &#39;King&#39;, datetime.datetime(2003, 6, 17, 0, 0))<br>(101, &#39;Kochhar&#39;, datetime.datetime(2005, 9, 21, 0, 0))<br>(102, &#39;De Haan&#39;, datetime.datetime(2001, 1, 13, 0, 0))</pre><p>Since my goal is to get rows as objects, I can create a class for the three fields and add a <a href="https://docs.python.org/3/library/dataclasses.html">dataclass decorator</a> so it can be used as the cursor rowfactory. The full script is:</p><pre># dc.py - Data Classes and Rowfactories<br><br>import getpass<br>import platform<br>import dataclasses<br>import datetime<br><br>import oracledb<br><br>un = &#39;cj&#39;<br>cs = &#39;localhost/orclpdb1&#39;<br>pw = getpass.getpass(f&#39;Enter password for {un}@{cs}: &#39;)<br><br>@dataclasses.dataclass<br>class MyRow:<br>    employee_id: int<br>    last_name: str<br>    hire_date: datetime.datetime<br><br>connection = oracledb.connect(user=un, password=pw, dsn=cs)<br>cursor = connection.cursor()<br><br>cursor.execute(<br>    &quot;&quot;&quot;select employee_id, last_name, hire_date<br>       from employees<br>       where employee_id &lt; 103<br>       order by employee_id&quot;&quot;&quot;)<br><br>cursor.rowfactory = MyRow<br><br>for row in cursor:<br>    print(row)</pre><p>the output is now:</p><pre>MyRow(employee_id=100, last_name=&#39;King&#39;, hire_date=datetime.datetime(2003, 6, 17, 0, 0))<br>MyRow(employee_id=101, last_name=&#39;Kochhar&#39;, hire_date=datetime.datetime(2005, 9, 21, 0, 0))<br>MyRow(employee_id=102, last_name=&#39;De Haan&#39;, hire_date=datetime.datetime(2001, 1, 13, 0, 0))</pre><p>Each row has been returned as a MyRow object from which you can access the data fields as normal. For example, if you change the loop to:</p><pre>for row in cursor:<br>    print(&quot;Number:&quot;, row.employee_id)<br>    print(&quot;Name:&quot;, row.last_name)<br>    print(&quot;Hire Date:&quot;, row.hire_date)</pre><p>the output is:</p><pre>Number: 100<br>Name: King<br>Hire Date: 2003-06-17 00:00:00<br>Number: 101<br>Name: Kochhar<br>Hire Date: 2005-09-21 00:00:00<br>Number: 102<br>Name: De Haan<br>Hire Date: 2001-01-13 00:00:00</pre><p>It’s simple and easy!</p><h4>Notes</h4><p>Any time execute() is called, any existing rowfactory on the cursor is cleared, so if you re-execute a statement, remember to set cursor.rowfactory again.</p><p>For other ways to change data with rowfactories, for example to return rows as dictionaries, see the python-oracledb documentation <a href="https://python-oracledb.readthedocs.io/en/latest/user_guide/sql_execution.html#changing-query-results-with-rowfactories">Changing Query Results with Rowfactories</a>.</p><h3>Python-oracledb Resources</h3><p>Python-oracledb is an open source package for the Python Database API specification with many additions to support advanced Oracle Database features. By default, it is a ‘Thin’ driver that is immediately usable without needing any additional install e.g. no Instant Client is required. Python-oracledb is used by frameworks, ORMs, SQL generation libraries, and other projects.</p><ul><li><a href="https://python-oracledb.readthedocs.io/en/latest/user_guide/installation.html">Installation instructions</a></li><li><a href="https://python-oracledb.readthedocs.io/en/latest/index.html">Documentation</a></li><li><a href="https://github.com/oracle/python-oracledb/discussions">Discussions</a></li><li><a href="https://github.com/oracle/python-oracledb">Source code on GitHub</a></li><li><a href="https://pypi.org/project/oracledb/">PyPI</a></li></ul><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=8c3ac5932d77" width="1" height="1" alt=""><hr><p><a href="https://medium.com/oracledevs/python-data-classes-make-it-easy-to-fetch-database-rows-as-objects-8c3ac5932d77">Python Data Classes make it easy to fetch database rows as objects</a> was originally published in <a href="https://medium.com/oracledevs">Oracle Developers</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Come and hear all about python-oracledb at Oracle AI World in Las Vegas]]></title>
            <link>https://cjones-oracle.medium.com/come-and-hear-all-about-python-oracledb-at-oracle-ai-world-in-las-vegas-418c76869654?source=rss-adc937c3a9d------2</link>
            <guid isPermaLink="false">https://medium.com/p/418c76869654</guid>
            <category><![CDATA[python]]></category>
            <category><![CDATA[conference]]></category>
            <category><![CDATA[oracle]]></category>
            <category><![CDATA[las-vegas]]></category>
            <category><![CDATA[development]]></category>
            <dc:creator><![CDATA[Christopher Jones]]></dc:creator>
            <pubDate>Mon, 22 Sep 2025 00:07:57 GMT</pubDate>
            <atom:updated>2025-09-22T00:07:57.018Z</atom:updated>
            <content:encoded><![CDATA[<p>If you are going to <a href="https://www.oracle.com/ai-world/">Oracle AI World</a> in Las Vegas next month, a must-see is the session <a href="https://reg.rf.oracle.com/flow/oracle/ocw25/catalog/page/catalog/session/1748638234591001SCed"><strong>Python-oracledb: Advanced Integration for Oracle Database and Python [LRN2902]</strong></a><strong> </strong>on Thursday, 16th October 2025.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*Vz6SdtVCoUtwHxCmj9becg.png" /></figure><h4>The Talk</h4><p><strong>Anthony Tuininga</strong>, the widely respected creator and lead developer of python-oracledb is presenting <a href="https://reg.rf.oracle.com/flow/oracle/ocw25/catalog/page/catalog/session/1748638234591001SCed">Python-oracledb: Advanced Integration for Oracle Database and Python [LRN2902]</a><strong> </strong>at 10:15am on Thursday, 16th October 2025. <strong>Hear all about the great new features in python-oracledb</strong> (including some cool features that are so new they aren’t in the abstract). It’s a great opportunity to ask questions, and to connect with us.</p><p>You may also be interested in <a href="https://reg.rf.oracle.com/flow/oracle/ocw25/catalog/page/catalog/session/1748642180501001K27H">Create and Implement Multicloud Applications for Any Runtime and Hyperscaler [LRN2914]</a>, which will cover “centralized configuration providers” in more detail. Our development VP, Srinath Krishnaswamy, will be giving the python-oracledb section of this talk.</p><h4><strong>The Booth</strong></h4><p>Yes there will be a demo booth. Details will be forthcoming.</p><h4>Python-oracledb Resources</h4><p>Python-oracledb is an open source package for the Python Database API specification with many additions to support advanced Oracle Database features. By default, it is a ‘Thin’ driver that is immediately usable without needing any additional install e.g. no Instant Client is required. Python-oracledb is used by frameworks, ORMs, SQL generation libraries, and other projects.</p><ul><li><a href="https://python-oracledb.readthedocs.io/en/latest/user_guide/installation.html">Installation instructions</a></li><li><a href="https://python-oracledb.readthedocs.io/en/latest/index.html">Documentation</a></li><li><a href="https://github.com/oracle/python-oracledb/discussions">Discussions</a></li><li><a href="https://github.com/oracle/python-oracledb">Source code on GitHub</a></li><li><a href="https://pypi.org/project/oracledb/">PyPI</a></li></ul><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=418c76869654" width="1" height="1" alt="">]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Application and database type mismatches slow down data loads]]></title>
            <link>https://medium.com/oracledevs/application-and-database-type-mismatches-slow-down-data-loads-eda6ced82702?source=rss-adc937c3a9d------2</link>
            <guid isPermaLink="false">https://medium.com/p/eda6ced82702</guid>
            <category><![CDATA[etl]]></category>
            <category><![CDATA[performance]]></category>
            <category><![CDATA[database]]></category>
            <category><![CDATA[python]]></category>
            <category><![CDATA[python-oracledb]]></category>
            <dc:creator><![CDATA[Christopher Jones]]></dc:creator>
            <pubDate>Fri, 19 Sep 2025 22:29:06 GMT</pubDate>
            <atom:updated>2025-10-08T23:21:20.837Z</atom:updated>
            <content:encoded><![CDATA[<h4>Back in <a href="https://medium.com/@cjones-oracle/round-trips-are-only-part-of-the-data-insertion-problem-6981e90d0884">Round-trips are only part of the data insertion problem</a>, I linked to a user case study where it was the optimizer, not the cost of round-trips, that slowed loading data into Oracle Database. The same problem bit me recently. Let me show you my example and the solution.</h4><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*nm-YR0jIjxYfKo19" /><figcaption>Photo by <a href="https://unsplash.com/@timgarifov?utm_source=medium&amp;utm_medium=referral">Timur Garifov</a> on <a href="https://unsplash.com?utm_source=medium&amp;utm_medium=referral">Unsplash</a></figcaption></figure><p>The scenario is using Python to load a 500,000 line CSV file “sample.csv” like:</p><pre>1,&quot;18-Sep-2025&quot;,&quot;String for row 1&quot;<br>2,&quot;18-Sep-2025&quot;,&quot;String for row 2&quot;<br>3,&quot;18-Sep-2025&quot;,&quot;String for row 3&quot;<br>. . .<br>500000,&quot;18-Sep-2025&quot;,&quot;String for row 500000&quot;</pre><p>into an Oracle Database table:</p><pre>create table mytab (id number, dt date, name varchar2(50));</pre><p>Notice that the data file has string formatted dates, as you would expect in CSV files.</p><p>A standard way of loading in python-oracledb is:</p><pre>import csv<br><br>cursor.setinputsizes(None, 20, 50)<br>data = []<br>csv_reader = csv.reader(open(&quot;sample.csv&quot;, &quot;r&quot;), delimiter=&quot;,&quot;)<br>for line in csv_reader:<br>    data.append((line[0], line[1], line[2]))<br>if data:<br>    sql = &quot;insert into mytab (id, dt, name) values (:1, :2, :3)&quot;<br>    cursor.executemany(sql, data)</pre><p>I carefully used <a href="https://python-oracledb.readthedocs.io/en/latest/api_manual/cursor.html#oracledb.Cursor.setinputsizes">setinputsizes()</a> so that python-oracledb’s <a href="https://python-oracledb.readthedocs.io/en/latest/api_manual/cursor.html#oracledb.Cursor.executemany">executemany()</a> call knew how much memory to allocate for each of the three fields and didn’t have to do slow re-allocations as more data was parsed. The first column is numeric, so I passed None to use the default type handling. The second and third CSV fields are strings so I chose some reasonable sizes.</p><p>In the CSV reader loop, I appended a tuple of each row to a list, which was eventually passed to the efficient <a href="https://python-oracledb.readthedocs.io/en/latest/api_manual/cursor.html#oracledb.Cursor.executemany">executemany()</a> function.</p><p>Running my example program (which has some additional timing code) gave:</p><pre>Total elapsed time: 29,576 ms<br>500000 rows were inserted</pre><p>My database is an emulated x86_64 architecture on an arm64 Mac, so it is inherently slow. Also I was running Python on the same Mac. However the elapsed time still looked way off what I expected.</p><p>The issue is the date field: there is a type conversion that is slow. Here is an updated variant of the script:</p><pre>import csv<br><br>cursor.setinputsizes(None, None, 50)<br>data = []<br>csv_reader = csv.reader(open(&quot;sample.csv&quot;, &quot;r&quot;), delimiter=&quot;,&quot;)<br>for line in csv_reader:<br>    data.append((float(line[0]), datetime.strptime(line[1], &quot;%d-%b-%Y&quot;), line[2]))<br>if data:<br>    sql = f&quot;insert into mytab (id, dt, name) values (:1, :2, :3)&quot;<br>    cursor.executemany(sql, data)</pre><p>It converts the CSV date strings into datetime objects when constructing the list. The corresponding second field of the <a href="https://python-oracledb.readthedocs.io/en/latest/api_manual/cursor.html#oracledb.Cursor.setinputsizes">setinputsizes()</a> call is now None since the default python-oracledb date handling knows the size of dates.</p><p>This time it ran in:</p><pre>Total elapsed time: 7,568 ms<br>500000 rows were inserted</pre><p>which is a <strong>lot</strong> faster. The number of round-trips was the same, but the bind type mismatch has been solved.</p><p>If you didn’t start by reviewing <a href="https://medium.com/@cjones-oracle/round-trips-are-only-part-of-the-data-insertion-problem-6981e90d0884">Round-trips are only part of the data insertion problem</a> I recommend reading it now. As well as the case study, it has some links to cases where round-trips <em>were</em> the problem, and shows how to measure and improve the load speed.</p><p>In summary, by efficiently saving round-trips, and by making sure the database is efficiently ingesting the data you are sending it, you can accelerate your data loads.</p><p>An addendum: you might know Pandas CVS functionality makes it easy to load into the database - and it has options to automatically parse dates. However, Pandas can be very slow so I generally avoid it for data uploads. An upcoming blog post will show this and give some new best-practice samples which will be even faster than executemany(). [<strong>Update</strong>: that post is <a href="https://medium.com/oracledevs/direct-path-loads-fast-data-ingestion-with-python-and-oracle-database-c681fb60384f">Direct Path Loads: Fast data ingestion with Python and Oracle Database</a>].</p><h3>Python-oracledb Resources</h3><p>Python-oracledb is an open source package for the Python Database API specification with many additions to support advanced Oracle Database features. By default, it is a ‘Thin’ driver that is immediately usable without needing any additional install e.g. no Instant Client is required. Python-oracledb is used by frameworks, ORMs, SQL generation libraries, and other projects.</p><ul><li><a href="https://python-oracledb.readthedocs.io/en/latest/user_guide/installation.html">Installation instructions</a></li><li><a href="https://python-oracledb.readthedocs.io/en/latest/index.html">Documentation</a></li><li><a href="https://github.com/oracle/python-oracledb/discussions">Discussions</a></li><li><a href="https://github.com/oracle/python-oracledb">Source code on GitHub</a></li><li><a href="https://pypi.org/project/oracledb/">PyPI</a></li></ul><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=eda6ced82702" width="1" height="1" alt=""><hr><p><a href="https://medium.com/oracledevs/application-and-database-type-mismatches-slow-down-data-loads-eda6ced82702">Application and database type mismatches slow down data loads</a> was originally published in <a href="https://medium.com/oracledevs">Oracle Developers</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[New Tcl oratcl fork makes deployment easy]]></title>
            <link>https://cjones-oracle.medium.com/new-tcl-oratcl-fork-makes-deployment-easy-2e7a359943fb?source=rss-adc937c3a9d------2</link>
            <guid isPermaLink="false">https://medium.com/p/2e7a359943fb</guid>
            <category><![CDATA[oracle-database]]></category>
            <category><![CDATA[development]]></category>
            <category><![CDATA[tcl]]></category>
            <category><![CDATA[oratcl]]></category>
            <category><![CDATA[driver]]></category>
            <dc:creator><![CDATA[Christopher Jones]]></dc:creator>
            <pubDate>Thu, 18 Sep 2025 21:39:48 GMT</pubDate>
            <atom:updated>2025-09-18T21:39:48.386Z</atom:updated>
            <content:encoded><![CDATA[<h4>Do you use the Tcl language with Oracle Database? Miguel Bañón has released a fork of the oratcl driver that introduces a loose coupling on the Oracle Client libraries, making it much easier to distribute and deploy.</h4><figure><img alt="A picture of a feather, making a visual pun on ‘tickle’, the pronounciation of the Tcl language" src="https://cdn-images-1.medium.com/max/1024/0*rM-YJzTZFwCu1tvh" /><figcaption>Photo by <a href="https://unsplash.com/@wolfgang_hasselmann?utm_source=medium&amp;utm_medium=referral">Wolfgang Hasselmann</a> on <a href="https://unsplash.com?utm_source=medium&amp;utm_medium=referral">Unsplash</a></figcaption></figure><p>You can now locate or update the Oracle Client libraries without having to rebuild oratcl. You can use Oracle Client libraries from <a href="https://www.oracle.com/database/technologies/instant-client.html">Oracle Instant Client</a> packages, from a full Oracle Client installation (such as installed by Oracle’s GUI installer), or from those included in Oracle Database if Tcl is on the same machine as the database.</p><p>The technology behind the oratcl change is <a href="https://github.com/oracle/odpi">ODPI-C</a>, Oracle’s wrapper over the Oracle Client libraries. ODPI-C dynamically loads the Oracle Client at runtime, so as long as you have the libraries in your OS library loading path, oratcl will use them. ODPI-C is used by a number of projects, notably <a href="https://pypi.org/project/oracledb/">python-oracledb</a> and <a href="https://www.npmjs.com/package/oracledb">node-oracledb</a> Thick modes.</p><p>The new orctcl fork is on GitHub <a href="https://github.com/bagnongithub/oratcl">here</a>. Documentation is <a href="https://bagnongithub.github.io/oratcl/">here</a>.</p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=2e7a359943fb" width="1" height="1" alt="">]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Using Python for Data Analysis and AI]]></title>
            <link>https://medium.com/oracledevs/using-python-for-data-analysis-and-ai-3e07e599e27b?source=rss-adc937c3a9d------2</link>
            <guid isPermaLink="false">https://medium.com/p/3e07e599e27b</guid>
            <category><![CDATA[database]]></category>
            <category><![CDATA[python]]></category>
            <category><![CDATA[dataframes]]></category>
            <category><![CDATA[artificial-intelligence]]></category>
            <category><![CDATA[analysis]]></category>
            <dc:creator><![CDATA[Christopher Jones]]></dc:creator>
            <pubDate>Mon, 01 Sep 2025 05:33:36 GMT</pubDate>
            <atom:updated>2025-10-09T01:36:47.101Z</atom:updated>
            <content:encoded><![CDATA[<h4>Are you using Python for data analysis and AI? Did you know the <a href="https://pypi.org/project/oracledb/">python-oracledb driver</a> for Oracle Database can query directly into, and insert from, Python DataFrames? This can be very fast when you want to use packages such as Apache PyArrow, Pandas, Polars, NumPy, Dask, PyTorch, or to write files in Apache Parquet or Delta Lake format.</h4><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*Ctvie9tjLXeXCiDf" /><figcaption>Photo by <a href="https://unsplash.com/@steve_j?utm_source=medium&amp;utm_medium=referral">Steve Johnson</a> on <a href="https://unsplash.com?utm_source=medium&amp;utm_medium=referral">Unsplash</a></figcaption></figure><p>Oracle Database is the best repository for your data when doing analytic or AI workloads.</p><p>Here are handy resources showing how simple and efficient it is fetch into DataFrames, and to insert DataFrames directly into Oracle Database using Python.</p><h4>Videos</h4><ul><li><a href="https://youtu.be/0BJNlbh71LY">Python DataFrames with Oracle Database for Analysis and AI</a></li><li><a href="https://youtu.be/lC07FNCzJ5w">Python DataFrames and the Oracle Database 23ai VECTOR Data Type</a></li></ul><h4>Blogs</h4><ul><li><a href="https://cjones-oracle.medium.com/the-best-way-to-fetch-and-insert-python-dataframes-and-tensors-with-oracle-database-70a1b24a3d99">The Best Way to Fetch and Insert Python DataFrames and Tensors with Oracle Database</a></li><li><a href="https://medium.com/oracledevs/direct-path-loads-fast-data-ingestion-with-python-and-oracle-database-c681fb60384f">Direct Path Loads: Fast data ingestion with Python and Oracle Database</a></li><li><a href="https://medium.com/oracledevs/going-10x-faster-with-python-oracledb-data-frames-9fe6ea736697">Going 10x faster with python-oracledb Data Frames</a></li><li><a href="https://levelup.gitconnected.com/writing-to-parquet-and-delta-lake-files-from-oracle-database-using-python-5f7382bfcdc6">Writing to Parquet and Delta Lake files from Oracle Database using Python</a></li><li><a href="https://medium.com/oracledevs/python-oracledb-3-0-data-frames-a-new-way-to-query-data-4139418bef82">python-oracledb 3.0 Data Frames — a new way to query data</a></li><li><a href="https://cjones-oracle.medium.com/python-oracledb-3-3-adds-dataframe-ingestion-and-sessionless-transactions-04948072c438">Python-oracledb 3.3 adds DataFrame ingestion and Sessionless Transactions</a></li></ul><h4>Samples</h4><ul><li>See files beginning with “dataframe” on <a href="https://github.com/oracle/python-oracledb/tree/main/samples">GitHub</a></li></ul><h4>Documentation</h4><ul><li><a href="https://python-oracledb.readthedocs.io/en/latest/user_guide/dataframes.html">Working with Data Frames</a></li><li><a href="https://www.oracle.com/database/technologies/appdev/python/quickstartpythononprem.html">Quick Start: Developing Python Applications for Oracle Database</a></li><li><a href="https://python-oracledb.readthedocs.io/en/latest/user_guide/installation.html">Installing python-oracledb</a></li></ul><h4>Background</h4><p>Python-oracledb is an open source package for the Python Database API specification with many additions to support advanced Oracle Database features. By default, it is a ‘Thin’ driver that is immediately usable without needing Oracle Instant Client libraries. Install it with python -m pip install oracledb</p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=3e07e599e27b" width="1" height="1" alt=""><hr><p><a href="https://medium.com/oracledevs/using-python-for-data-analysis-and-ai-3e07e599e27b">Using Python for Data Analysis and AI</a> was originally published in <a href="https://medium.com/oracledevs">Oracle Developers</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
    </channel>
</rss>