Yann Neuhaus

dbi services technical blog

Java and InfluxDB http api

Tue, 2020-07-21 03:29
InfluxDB

InfluxDB is a powerful open source time series database (TSDB) developed by InfluxData. It is a database optimized for time-stamped or time series data. Time series data are simply measurements or events that are tracked over time.
It is particularly interesting for tracking metrics like server CPU, RAM, application performance and so on.

It is similar to SQL databases but different in many ways. The key here is time: the database engine is optimized for high ingest rates and data compression. It is written in Go and compiles into one single binary without any dependencies.
We can use it through a CLI, but the most interesting part is its HTTP API, which allows us to perform actions on the measurements without any third-party API.
It can be installed on a single server or in the cloud. It is highly compatible with Telegraf, a time series data collector, but we will discuss that in another blog.

Installation

I did the installation on CentOS but it can be installed on other Linux distributions, OS X and Windows. You can download the rpm/zip/tar from the download page or you can install it directly as follows:

Add the repo:

cat <<EOF | sudo tee /etc/yum.repos.d/influxdb.repo
[influxdb]
name = InfluxDB Repository - RHEL \$releasever
baseurl = https://repos.influxdata.com/rhel/\$releasever/\$basearch/stable
enabled = 1
gpgcheck = 1
gpgkey = https://repos.influxdata.com/influxdb.key
EOF

Ensure the cache is up to date:

sudo yum makecache fast

Install the database; we also install vim and curl (curl will be used to test the HTTP API):

sudo yum -y install influxdb vim curl

We enable the HTTP API:

sudo vim /etc/influxdb/influxdb.conf
[http]
  enabled = true
  bind-address = ":8086"
  auth-enabled = true
  log-enabled = true
  write-tracing = false
  pprof-enabled = true
  pprof-auth-enabled = true
  debug-pprof-enabled = false
  ping-auth-enabled = true

We start and enable the service:

sudo systemctl start influxdb && sudo systemctl enable influxdb

We need to open the port used by InfluxDB:

sudo firewall-cmd --add-port=8086/tcp --permanent
sudo firewall-cmd --reload

Now we create an admin account:

curl -XPOST "http://localhost:8086/query" --data-urlencode "q=CREATE USER username WITH PASSWORD 'password' WITH ALL PRIVILEGES"

If the query executed correctly you shouldn’t have any response.

You can now test your connection with the cli:

influx -username 'username' -password 'password'

You can list the databases with:

SHOW DATABASES

To create a database do the following:

CREATE DATABASE mymetrics

And to navigate to your created DB:

USE mymetrics

Once you have some entries, you will be able to list them with:

SHOW MEASUREMENTS

And then you can execute basic SQL-like queries (only when you have measurements):

SELECT * FROM metric_cpu_load

 

Measurements via Java

Now that we have our DB set up and a user to connect with, we will generate measurements with Java. But you can do it with any language able to send HTTP requests. There is a Java API to talk to the database directly (here), but I wanted to do it through the HTTP API.

You will need the httpclient and httpcore jars from Apache HttpComponents to use the latest HTTP API. The versions I used were:

  • httpclient5-5.0.1.jar
  • httpcore5-5.0.1.jar
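
If you build the example with Maven, the equivalent dependency could look like the snippet below (the coordinates are an assumption based on the jar versions above; httpcore5 is pulled in transitively):

<dependency>
    <groupId>org.apache.httpcomponents.client5</groupId>
    <artifactId>httpclient5</artifactId>
    <version>5.0.1</version>
</dependency>

With these libraries on the classpath, a measurement can be written like this: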
import java.net.URI;

import org.apache.hc.client5.http.classic.methods.HttpPost;
import org.apache.hc.client5.http.impl.classic.CloseableHttpClient;
import org.apache.hc.client5.http.impl.classic.HttpClients;
import org.apache.hc.core5.http.ContentType;
import org.apache.hc.core5.http.io.entity.StringEntity;
import org.apache.hc.core5.net.URIBuilder;

String data = "metric_cpu_load value=0.5"; // Line protocol body: measurement name and value
// We create a default HTTP client
CloseableHttpClient httpclient = HttpClients.createDefault();
// We create the URI with all the information
URI uri = new URIBuilder()
        .setScheme("http") // Can be https if ssl is enabled
        .setHost("localhost") // The hostname of the database
        .setPort(8086) // Default InfluxDB port
        .setPath("/write") // We call /write to put data into the database
        .setParameter("db", "mymetrics") // The name of the database we created
        .setParameter("u", "username") // The username of the account we created
        .setParameter("p", "password") // The password of the account
        .build();
HttpPost httppost = new HttpPost(uri); // We create a POST request in order to add data
StringEntity entity = new StringEntity(data, ContentType.TEXT_PLAIN); // The metric name and value are set as the body of the POST request
httppost.setEntity(entity); // We link the entity to the POST request
httpclient.execute(httppost); // Executes the POST request; the result is an HttpResponse object and if the query went well you should get a code 204
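
As a small variation on the last line above, the returned status code can be checked instead of ignoring the response; a minimal sketch, assuming the same HttpClient 5.x classic API (CloseableHttpResponse comes from org.apache.hc.client5.http.impl.classic):

// Sketch: execute the POST and verify the status code returned by InfluxDB
try (CloseableHttpResponse response = httpclient.execute(httppost)) {
    if (response.getCode() != 204) { // 204 No Content means the write was accepted
        System.err.println("InfluxDB write failed with code " + response.getCode());
    }
}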

InfluxDB stores measurements in "tables". One measurement type is a table; you can specify several columns in the table, but note that the "value" column is mandatory. Here is the explanation of the body of the POST request:

"table_name,col1=value1,col2=value2 value=value0 timestamp"

  • table_name is the name of the metric; if the table doesn't exist it will be created, and if it exists, a new row will be added to this "table"
  • colx entries are optional, they are tags or info that qualify this row; you can even have some rows with these tags while others don't
  • value is the mandatory field, it must be specified or the POST request will fail
  • timestamp is the timestamp of the row; you can specify it or omit it, and if not specified, it will be set to the time when the row was added
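
For instance, a single point with one tag and an explicit timestamp (in nanoseconds since the epoch; the values here are made up) would look like this:

metric_cpu_load,hostname=localhost value=0.73 1595318500000000000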
Example

Here is a simple example.

We register several values:

[POST] http://localhost:8086/write?db=mymetrics&u=username&p=password
BODY: "metric_cpu_load,hostname=localhost value=0.1"

[POST] http://localhost:8086/write?db=mymetrics&u=username&p=password
BODY: "metric_cpu_load,hostname=localhost value=0.4"

[POST] http://localhost:8086/write?db=mymetrics&u=username&p=password
BODY: "metric_cpu_load,hostname=localhost value=0.8"

[POST] http://localhost:8086/write?db=mymetrics&u=username&p=password
BODY: "metric_cpu_load,hostname=localhost value=0.5"

Or with curl (first creating the DB if it doesn't exist):

curl -i -XPOST 'http://localhost:8086/query?u=username&p=password' --data-urlencode "q=CREATE DATABASE mymetrics" -> returns code 200 OK
curl -i -XPOST 'http://localhost:8086/write?db=mymetrics&u=username&p=password' --data-binary 'metric_cpu_load,hostname=localhost value=0.1' -> returns 204 NO CONTENT
curl -i -XPOST 'http://localhost:8086/write?db=mymetrics&u=username&p=password' --data-binary 'metric_cpu_load,hostname=localhost value=0.4'
curl -i -XPOST 'http://localhost:8086/write?db=mymetrics&u=username&p=password' --data-binary 'metric_cpu_load,hostname=localhost value=0.8'
curl -i -XPOST 'http://localhost:8086/write?db=mymetrics&u=username&p=password' --data-binary 'metric_cpu_load,hostname=localhost value=0.5'

Now you can see the result from the cli:

influx -username 'username' -password 'password'

Connected to http://localhost:8086 version 1.8.1
InfluxDB shell version: 1.8.1

> SHOW DATABASES
name: databases
name
----
_internal
mymetrics

> USE mymetrics
Using database mymetrics

> SHOW MEASUREMENTS
name: measurements
name
----
metric_cpu_load

> SELECT * FROM metric_cpu_load
name: metric_cpu_load
time                hostname  value
----                --------  -----
1595318363624096578 localhost 0.1
1595318430152497136 localhost 0.4
1595318433209434527 localhost 0.8
1595318436384650878 localhost 0.5

We can also get it from HTTP:

http://localhost:8086/query?db=mymetrics&u=username&p=password&q=select * from metric_cpu_load
{"results":
  [{
    "statement_id":0,
    "series":[{
       "name":"metric_cpu_load",
       "columns":["time","hostname","value"],
       "values":[
           ["2020-07-21T07:59:23.624096578Z","localhost",0.1],
           ["2020-07-21T08:00:30.152497136Z","localhost",0.4],
           ["2020-07-21T08:00:33.209434527Z","localhost",0.8],
           ["2020-07-21T08:00:36.384650878Z","localhost",0.5]]}]}]}
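
The same query can also be sent with curl; since the query contains spaces it must be URL encoded, which --data-urlencode takes care of (a sketch):

curl -G 'http://localhost:8086/query?db=mymetrics&u=username&p=password' --data-urlencode 'q=SELECT * FROM metric_cpu_load'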
Next steps

And now what? We have a time series database where we can add and list measurements through Java and curl; what can we do with this?

Next time we will see how to display the results in shiny graphs thanks to Grafana.

 


Oracle ASH SQL_PLAN_LINE_ID in adaptive plans

Mon, 2020-07-20 09:13

There are several methods to find out where time is spent in the execution plan of a query running in an Oracle database: classical methods like SQL Trace and running a formatter tool like tkprof on the raw trace, or newer methods like SQL Monitor (when the Tuning Pack has been licensed), or running a query with the GATHER_PLAN_STATISTICS hint (or with statistics_level=all set in the session) and then using DBMS_XPLAN.DISPLAY_CURSOR(format=>'ALLSTATS LAST'). However, what I often use is the information in SQL_PLAN_LINE_ID of Active Session History (ASH), i.e. samples taken every second in the recent past through V$ACTIVE_SESSION_HISTORY, or in the AWR history through DBA_HIST_ACTIVE_SESS_HISTORY. But you have to be careful when interpreting the ASH SQL_PLAN_LINE_ID in an adaptive plan.

Here’s an example:

REMARK: I artificially slowed down the processing with the method described by Chris Antognini here: https://antognini.ch/2013/12/adaptive-plans-in-active-session-history/

SQL> @test

DEPARTMENT_NAME
------------------------------
Executive
IT
Finance
Purchasing
Shipping
Sales
Administration
Marketing
Human Resources
Public Relations
Accounting

11 rows selected.

Elapsed: 00:00:27.83
SQL> select * from table(dbms_xplan.display_cursor(format=>'ALLSTATS LAST'));

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------
SQL_ID  8nuct789bh45m, child number 0
-------------------------------------
 select distinct DEPARTMENT_NAME from DEPARTMENTS d   join EMPLOYEES e
using(DEPARTMENT_ID)   where d.LOCATION_ID like '%0' and e.SALARY>2000
and burn_cpu(e.employee_id/e.employee_id/4) = 1

Plan hash value: 1057942366

------------------------------------------------------------------------------------------------------------------------
| Id  | Operation           | Name        | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |             |      1 |        |     11 |00:00:27.82 |      12 |       |       |          |
|   1 |  HASH UNIQUE        |             |      1 |      1 |     11 |00:00:27.82 |      12 |  1422K|  1422K|61440  (0)|
|*  2 |   HASH JOIN         |             |      1 |      1 |    106 |00:00:27.82 |      12 |  1572K|  1572K| 1592K (0)|
|*  3 |    TABLE ACCESS FULL| DEPARTMENTS |      1 |      1 |     27 |00:00:00.01 |       6 |       |       |          |
|*  4 |    TABLE ACCESS FULL| EMPLOYEES   |      1 |      1 |    107 |00:00:27.82 |       6 |       |       |          |
------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("D"."DEPARTMENT_ID"="E"."DEPARTMENT_ID")
   3 - filter(TO_CHAR("D"."LOCATION_ID") LIKE '%0')
   4 - filter(("E"."SALARY">2000 AND "BURN_CPU"("E"."EMPLOYEE_ID"/"E"."EMPLOYEE_ID"/4)=1))

Note
-----
   - this is an adaptive plan


30 rows selected.

So I’d expect to see SQL_PLAN_LINE_ID = 4 in ASH for the 28 seconds of wait time here. But it’s different:

SQL> select sql_plan_line_id, count(*) from v$active_session_history 
   2 where sql_id='8nuct789bh45m'and sql_plan_hash_value=1057942366 group by sql_plan_line_id;

SQL_PLAN_LINE_ID   COUNT(*)
---------------- ----------
               9         28

1 row selected.

The 28 seconds are spent on SQL_PLAN_LINE_ID 9. Why is that?

The reason is that ASH reports the SQL_PLAN_LINE_ID in adaptive plans according to the whole adaptive plan, including the inactive rows:

SQL> select * from table(dbms_xplan.display_cursor('8nuct789bh45m',format=>'+ADAPTIVE'));

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------
SQL_ID  8nuct789bh45m, child number 0
-------------------------------------
 select distinct DEPARTMENT_NAME from DEPARTMENTS d   join EMPLOYEES e
using(DEPARTMENT_ID)   where d.LOCATION_ID like '%0' and e.SALARY>2000
and burn_cpu(e.employee_id/e.employee_id/4) = 1

Plan hash value: 1057942366

------------------------------------------------------------------------------------------------------
|   Id  | Operation                      | Name              | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------------
|     0 | SELECT STATEMENT               |                   |       |       |     5 (100)|          |
|     1 |  HASH UNIQUE                   |                   |     1 |    30 |     5  (20)| 00:00:01 |
|  *  2 |   HASH JOIN                    |                   |     1 |    30 |     4   (0)| 00:00:01 |
|-    3 |    NESTED LOOPS                |                   |     1 |    30 |     4   (0)| 00:00:01 |
|-    4 |     NESTED LOOPS               |                   |    10 |    30 |     4   (0)| 00:00:01 |
|-    5 |      STATISTICS COLLECTOR      |                   |       |       |            |          |
|  *  6 |       TABLE ACCESS FULL        | DEPARTMENTS       |     1 |    19 |     3   (0)| 00:00:01 |
|- *  7 |      INDEX RANGE SCAN          | EMP_DEPARTMENT_IX |    10 |       |     0   (0)|          |
|- *  8 |     TABLE ACCESS BY INDEX ROWID| EMPLOYEES         |     1 |    11 |     1   (0)| 00:00:01 |
|  *  9 |    TABLE ACCESS FULL           | EMPLOYEES         |     1 |    11 |     1   (0)| 00:00:01 |
------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("D"."DEPARTMENT_ID"="E"."DEPARTMENT_ID")
   6 - filter(TO_CHAR("D"."LOCATION_ID") LIKE '%0')
   7 - access("D"."DEPARTMENT_ID"="E"."DEPARTMENT_ID")
   8 - filter(("E"."SALARY">2000 AND "BURN_CPU"("E"."EMPLOYEE_ID"/"E"."EMPLOYEE_ID"/4)=1))
   9 - filter(("E"."SALARY">2000 AND "BURN_CPU"("E"."EMPLOYEE_ID"/"E"."EMPLOYEE_ID"/4)=1))

Note
-----
   - this is an adaptive plan (rows marked '-' are inactive)


37 rows selected.

So here we see the correct line 9. It’s actually a good idea to use

format=>'+ADAPTIVE'

when gathering plan statistics (through GATHER_PLAN_STATISTICS-hint or STATISTICS_LEVEL=ALL):

SQL> @test

DEPARTMENT_NAME
------------------------------
Executive
IT
Finance
Purchasing
Shipping
Sales
Administration
Marketing
Human Resources
Public Relations
Accounting

11 rows selected.

Elapsed: 00:00:27.96
SQL> select * from table(dbms_xplan.display_cursor(format=>'ALLSTATS LAST +ADAPTIVE'));

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------
SQL_ID  8nuct789bh45m, child number 0
-------------------------------------
 select distinct DEPARTMENT_NAME from DEPARTMENTS d   join EMPLOYEES e
using(DEPARTMENT_ID)   where d.LOCATION_ID like '%0' and e.SALARY>2000
and burn_cpu(e.employee_id/e.employee_id/4) = 1

Plan hash value: 1057942366

-------------------------------------------------------------------------------------------------------------------------------------------
|   Id  | Operation                      | Name              | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
-------------------------------------------------------------------------------------------------------------------------------------------
|     0 | SELECT STATEMENT               |                   |      1 |        |     11 |00:00:27.85 |    1216 |       |       |          |
|     1 |  HASH UNIQUE                   |                   |      1 |      1 |     11 |00:00:27.85 |    1216 |  1422K|  1422K|73728  (0)|
|  *  2 |   HASH JOIN                    |                   |      1 |      1 |    106 |00:00:27.85 |    1216 |  1572K|  1572K| 1612K (0)|
|-    3 |    NESTED LOOPS                |                   |      1 |      1 |     27 |00:00:00.01 |       6 |       |       |          |
|-    4 |     NESTED LOOPS               |                   |      1 |     10 |     27 |00:00:00.01 |       6 |       |       |          |
|-    5 |      STATISTICS COLLECTOR      |                   |      1 |        |     27 |00:00:00.01 |       6 |       |       |          |
|  *  6 |       TABLE ACCESS FULL        | DEPARTMENTS       |      1 |      1 |     27 |00:00:00.01 |       6 |       |       |          |
|- *  7 |      INDEX RANGE SCAN          | EMP_DEPARTMENT_IX |      0 |     10 |      0 |00:00:00.01 |       0 |       |       |          |
|- *  8 |     TABLE ACCESS BY INDEX ROWID| EMPLOYEES         |      0 |      1 |      0 |00:00:00.01 |       0 |       |       |          |
|  *  9 |    TABLE ACCESS FULL           | EMPLOYEES         |      1 |      1 |    107 |00:00:27.85 |    1210 |       |       |          |
-------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("D"."DEPARTMENT_ID"="E"."DEPARTMENT_ID")
   6 - filter(TO_CHAR("D"."LOCATION_ID") LIKE '%0')
   7 - access("D"."DEPARTMENT_ID"="E"."DEPARTMENT_ID")
   8 - filter(("E"."SALARY">2000 AND "BURN_CPU"("E"."EMPLOYEE_ID"/"E"."EMPLOYEE_ID"/4)=1))
   9 - filter(("E"."SALARY">2000 AND "BURN_CPU"("E"."EMPLOYEE_ID"/"E"."EMPLOYEE_ID"/4)=1))

Note
-----
   - this is an adaptive plan (rows marked '-' are inactive)


37 rows selected.

Summary: Be careful when checking SQL_PLAN_LINE_IDs in adaptive plans. This seems obvious in simple plans like the one above, but you may go in the wrong direction with plans that have hundreds of lines. Your analysis may be wrong if you are looking at the wrong plan step. To do some further validation, you may also query sql_plan_operation and sql_plan_options from v$active_session_history, as shown below.
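
For example, such a validation query could look like this (a sketch reusing the SQL_ID and plan hash value from above):

select sql_plan_line_id, sql_plan_operation, sql_plan_options, count(*)
from v$active_session_history
where sql_id='8nuct789bh45m' and sql_plan_hash_value=1057942366
group by sql_plan_line_id, sql_plan_operation, sql_plan_options
order by count(*) desc;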


DBPod – le podcast Bases de Données

Tue, 2020-07-14 15:20
By Franck Pachot

.
I am trying something new. I publish a lot in English (blog, articles, presentations) but this time it is something 100% French-speaking. Coming out of lockdown, we are back on the road and rails (train, car, ...) and it is an opportunity to relax with music but also to stay informed with podcasts. I have the feeling that this is a format with a future: less constraining than watching a video or reading an article or a newsletter. So I am testing a 100% free platform: Anchor (it is a bit like the 'Medium' of podcasts).

What I find interesting about Anchor is that there is a smartphone app which also allows you to leave voice messages. So do not hesitate to use it for remarks or questions. Your experiences with databases can be interesting for others. Cloud, Open Source migrations, ... it is in the air and it is always useful to share. With the health crisis there are fewer meetups, so why not try other media.

For the moment I have published 3 episodes on Oracle, covering the important topics for this year: the versions (19c or 20c), the Release Updates and Multitenant. And a more general episode on serverless databases. Future topics will depend on the news and on my daily experience in consulting or research, but also on your questions.

Anchor seems practical to me, but it is also available on Spotify ("Follow" lets you get notified of new episodes):

And several other platforms, like iTunes:
https://podcasts.apple.com/ch/podcast/dbpod-le-podcast-bases-de-donn%C3%A9es/id1520397763?l=fr

Or RadioPublic:

Breaker:
https://www.breaker.audio/db-pod-le-podcast-bases-de-donnees
Google Podcasts:
https://www.google.com/podcasts?feed=aHR0cHM6Ly9hbmNob3IuZm0vcy8yMDdjNmIyMC9wb2RjYXN0L3Jzcw==
Overcast:
https://overcast.fm/itunes1520397763/db-pod-le-podcast-bases-de-donn-es
Pocket Casts:
https://pca.st/qtv36wbn

For these first episodes, I am still learning... The recording is much too fast: I was a bit heavy-handed with Audacity, which makes it easy to remove imperfections. And when you listen to yourself... you see imperfections everywhere! Do not hesitate to give me feedback, both on the content (the topics you would like me to cover) and on the form (it can only get better...). And share it around you, with colleagues or managers. The topics will be less technical and more general than my blogs.


Documentum – Some things to take care of when migrating/upgrading

Tue, 2020-07-14 14:21

Over the years working with Documentum, I have participated in many migrations and/or upgrades. Just like everything else, this builds up experience, but what if you are pretty new to Documentum and it is your first time working on one? Sure, there is the documentation from OpenText, which is several dozens of pages (~140, depending on the version) of intense fun, so prepare a lot of coffee, you will definitely need some. Except for the documentation, there isn't much information available, so I was thinking about writing a rather short blog to list some key things to absolutely look at if you don't want to lose all your work, or worse. This is obviously not going to be an exhaustive list; that's not really possible, else it would be in the documentation directly.

 

The definitions of a migration and an upgrade can be found in the documentation. If these two keywords aren't clear to you, you should definitely go take a look because it can be very confusing, especially some elements of the definition from OpenText. Alright, so here are a few key takeaways:

  • Read the documentation. Yeah, I know it’s annoying and you will have to excuse me but I had to. There is a lot of crucial information in the documentation, so you don’t really have the choice. This will really help.

  • Look at the supported versions early (clients included!). A migration or an upgrade, for Documentum, is a rather complex project and the process that you will need to apply to attain your goal will heavily depend on supported versions. The result will obviously always be the same (the repository is migrated/upgraded) but the path to reach it can be very different. It’s not so easy to define the correct & supported path because there are many possibilities and many things to take into consideration. In addition, there might also be version specific requirements. For example if you are planning to upgrade a Content Server to 16.7 or above, then all DFC Clients (xPlore, DA, D2, …) will also need to be 16.7 or above so this needs to come into the picture.

  • Think about the data availability. In the migration/upgrade process, you could opt to keep the source system intact and perform the upgrade on a staging environment before moving to the target system: for example to move from a VM to Docker/Kubernetes. Technically, you don’t need the data on the staging to do an in-place upgrade of the repository, all you need is the Database (export the DB from the source and import it into the DB of the staging). However, performing the in-place upgrade on the staging will create some objects in the repository and these objects will therefore have their data created on the staging data folder. Therefore, if the target system isn’t the staging one, you will then be missing some data because it stayed in the staging while the data that you have on the target is the data from the source (or a copy of the data from the source). If you are following me, you can obviously just perform an additional copy of the staging upgraded data (it’s mainly jar, txt and xml files created by DARs installation), that’s simple because you can easily find which data were created by the upgrade. The purpose of the staging is usually to have the binaries for the upgrade so the initial data will be almost empty, most probably still using the folder “content_storage_01/<hexa_id>/80/00/0x” with x usually equal to 3/4/5. While the data created by the upgrade process will have a much higher id (the next available id, based on the source DB) like “content_storage_01/<hexa_id>/80/8d/6x” for example. I didn’t test it so that might just be a wild guess but maybe re-installing the DARs would have the same effect? That would be something to test…

  • Think about encryption. Whether you are using an AEK key or a Lockbox, you will need to take care of the repository encryption (same for data encryption if you have a TCS license). The simplest is usually to copy the original AEK/Lockbox along with its associated repository(ies). However, you might not have this luxury, especially if the source is an old version (does the AEK need to be upgraded?).

  • Think about passwords (can also be seen as related to the point about encryption). You can change all passwords if you want between the source and the target but that will obviously add an overhead. It’s usually simpler to change passwords (if required) once the migration/upgrade is completed but for that, you will probably want to first copy the AEK/Lockbox. That reminds me of blogs I wrote several years ago about password changes on various Documentum components/layers, like for the Lockbox.

  • Think about the Global Registry Repository. In case of migration, the GR doesn’t really need to be migrated, that is only if the Repository that you want to migrate isn’t the GR, obviously ;). However, there might be some data specific for the normal Repository inside the GR, like D2 Keystore information (D2 4.7+) for example. It’s usually faster to just recreate the needed objects/configurations on the target system, compared to the need to duplicate all efforts to migrate another repo (the GR).

  • Use the same docbase id. This might sound silly but I prefer to say it, to be sure. If your process involves a different target system, then you will need to install an empty repository on it. It’s not strictly mandatory to use the same docbase id because you can always manually create the needed folder structure on the target system (log folders, data folders, aso…) as well as the needed files (server.ini, dbpasswd.txt, start/stop scripts, ServerApps/ACS config for the repository, aso…). However, doing this manually would take a lot of time to make sure nothing is forgotten while creating the repository, in silent at least, just needs one command to be executed and that’s it. Since you are going to create a repository anyway, you might as well just use directly the correct docbase id from the source. Starting with the CS 7.3, if you want to change the docbase id of a Repository, that’s possible. You don’t have to remove it completely, you can just use the “MigrationUtil” utility (See this blog for an example).

  • Make sure there are no locked objects. As far as I remember, the documentation only specifies that the dm_server_config objects must not be locked. In reality, you should really check all dm_sysobject objects (see the DQL sketch after this list). It's not a problem if some user documents are locked, since the migration/upgrade process will not update them, but Documentum OOTB objects shouldn't be, otherwise some steps might fail, sometimes even silently, which is then pretty hard to detect. I'm saying this for the dm_ objects but it also applies to other types of objects. For example, if you have a d2_dictionary_archive object locked, then running a D2 Config import will fail, and this is just one of many many cases, so better be safe than sorry.

  • Look for version specific steps. I already gave an example above about the CS 16.7+ that needs all DFC clients to be 16.7+ as well but there are other things. For example, if you are trying to upgrade to 16.7+ and if you are using a Lockbox, you will first need to extract the AEK Key from the Lockbox since it has been deprecated (See this blog for guidance).



  • Stay patient and open-minded. Don’t worry, the migration/upgrade will probably fail but that’s OK, you are here to fix it ;).
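
As a complement to the point about locked objects above, here is a minimal DQL sketch to list locked sysobjects (adapt the WHERE clause to the object types you care about):

select r_object_id, object_name, r_lock_owner, r_lock_date
from dm_sysobject
where r_lock_owner is not nullstring
order by r_lock_date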

 

These are just some key points I like to keep in mind. If there are some you would like to share, feel free and I can also add them to the list.

 


19c: scalable Top-N queries without further hints to the query planner

Tue, 2020-07-14 04:36
By Franck Pachot

.
The FETCH FIRST … ROWS ONLY syntax arrived in Oracle 12c and is much more convenient than using a subquery with 'ORDER BY' wrapped in a "WHERE ROWNUM < …" around it. But as I mentioned in a previous post, it required the FIRST_ROWS() hint to get correct estimations. In SQL you don't want to overload your code for performance, right? The RDBMS optimizer does the job for you. This was a bug with the new FETCH FIRST syntax, and it is fixed (see Nigel Bayliss' post about this) in 19c:


SQL> select bugno,value,optimizer_feature_enable,description from  V$SYSTEM_FIX_CONTROL where bugno=22174392;

      BUGNO    VALUE    OPTIMIZER_FEATURE_ENABLE                                                      DESCRIPTION
___________ ________ ___________________________ ________________________________________________________________
   22174392        1 19.1.0                      first k row optimization for window function rownum predicate

You can use this query on the database metadata to check that you have the fix enabled (VALUE=1 means that the fix is ON). Yes, Oracle Database is not open source, but a lot of information is disclosed: you can query, with SQL, all optimizer enhancements and bug fixes, and the versions in which they appear. And you can even enable or disable them at query level:


select /*+ opt_param('_fix_control' '22174392:OFF') */
continentexp,countriesandterritories,cases from covid where cases>0
order by daterep desc, cases desc fetch first 5 rows only

This simulates previous versions where the fix was not there.

Here is an example on the COVID table I've created in a previous post: https://blog.dbi-services.com/oracle-select-from-file/
I’m running this on the Oracle Cloud 20c preview but you can run the same on any recent version.

FETCH FIRST n ROWS

I'm looking at the Top-5 countries with the highest covid cases on the latest date I have in my database. This means ORDER BY the date and the number of cases (both in descending order) and fetching only the first 5 rows.


SQL> select continentexp,countriesandterritories,cases
  2  from covid where cases>0
  3  order by daterep desc, cases desc fetch first 5 rows only
  4  /


   CONTINENTEXP     COUNTRIESANDTERRITORIES    CASES
_______________ ___________________________ ________
America         United_States_of_America       57258
Asia            India                          28701
America         Brazil                         24831
Africa          South_Africa                   12058
Europe          Russia                          6615


SQL> select * from dbms_xplan.display_cursor(format=>'allstats last')
  2  /


                                                                                                         PLAN_TABLE_OUTPUT
__________________________________________________________________________________________________________________________
SQL_ID  753q1ymf0sv0w, child number 1
-------------------------------------
select continentexp,countriesandterritories,cases from covid where
cases>0 order by daterep desc, cases desc fetch first 5 rows only

Plan hash value: 1833981741

-----------------------------------------------------------------------------------------------------------------------
| Id  | Operation                | Name  | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
-----------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT         |       |      1 |        |      5 |00:00:00.01 |     239 |       |       |          |
|*  1 |  VIEW                    |       |      1 |      5 |      5 |00:00:00.01 |     239 |       |       |          |
|*  2 |   WINDOW SORT PUSHED RANK|       |      1 |  20267 |      5 |00:00:00.01 |     239 |   124K|   124K|  110K (0)|
|*  3 |    TABLE ACCESS FULL     | COVID |      1 |  20267 |  18150 |00:00:00.01 |     239 |       |       |          |
-----------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("from$_subquery$_002"."rowlimit_$$_rownumber"<=5)
   2 - filter(ROW_NUMBER() OVER ( ORDER BY INTERNAL_FUNCTION("DATEREP") DESC ,INTERNAL_FUNCTION("CASES") DESC )<=5)
   3 - filter("CASES">0)

I have queried the execution plan because the RDBMS optimization is not a black box: you can ask for an explanation. Here, it reads the whole table (TABLE ACCESS FULL), sorts it, and filters the rows up to number 5.

This is not efficient at all. In a relational database, rather than streaming all changes to another data store, we add purpose-built indexes to scale with a new query use-case:


SQL> create index covid_date_cases on covid(daterep,cases)
  2  /
Index COVID_DATE_CASES created.

That’s all. This index will be automatically maintained with strong consistency, and transparently for any modifications to the table.

NOSORT STOPKEY

I’m running exactly the same query:


SQL> select continentexp,countriesandterritories,cases
  2  from covid where cases>0
  3  order by daterep desc, cases desc fetch first 5 rows only
  4  /

   CONTINENTEXP     COUNTRIESANDTERRITORIES    CASES
_______________ ___________________________ ________
America         United_States_of_America       57258
Asia            India                          28701
America         Brazil                         24831
Africa          South_Africa                   12058
Europe          Russia                          6615

SQL> select * from dbms_xplan.display_cursor(format=>'allstats last')
  2  /

                                                                                              PLAN_TABLE_OUTPUT
_______________________________________________________________________________________________________________
SQL_ID  753q1ymf0sv0w, child number 2
-------------------------------------
select continentexp,countriesandterritories,cases from covid where
cases>0 order by daterep desc, cases desc fetch first 5 rows only

Plan hash value: 1929215041

------------------------------------------------------------------------------------------------------------
| Id  | Operation                     | Name             | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |                  |      1 |        |      5 |00:00:00.01 |       7 |
|*  1 |  VIEW                         |                  |      1 |      5 |      5 |00:00:00.01 |       7 |
|*  2 |   WINDOW NOSORT STOPKEY       |                  |      1 |      6 |      5 |00:00:00.01 |       7 |
|   3 |    TABLE ACCESS BY INDEX ROWID| COVID            |      1 |  20267 |      5 |00:00:00.01 |       7 |
|*  4 |     INDEX FULL SCAN DESCENDING| COVID_DATE_CASES |      1 |      6 |      5 |00:00:00.01 |       3 |
------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("from$_subquery$_002"."rowlimit_$$_rownumber"<=5)
   2 - filter(ROW_NUMBER() OVER ( ORDER BY INTERNAL_FUNCTION("DATEREP") DESC,INTERNAL_FUNCTION("CASES") DESC )<=5)
   4 - filter("CASES">0)

There is a big difference here. I don't need to read the table anymore: the index provides the entries already sorted and an INDEX FULL SCAN DESCENDING will just read the first ones. Look at the 'Buffers' column here, which shows the read units. In the previous test, without the index, it was 239 blocks. What was bad there is that the FULL TABLE SCAN has O(N) time complexity. Now with the index, I read only 7 blocks (3 to go down the index B*Tree and 4 to get the remaining column values). And this has constant time complexity: even with billions of rows in the table this query will read fewer than 10 blocks, most of them probably from memory. This will take less than a millisecond whatever the size of the table.

3 small conclusions here:

  • If you think your RDBMS doesn't scale, be sure that you run the latest version of it. Mainstream databases get constant improvements and bug fixes.
  • You don't need another database each time you have another use case. Creating indexes can solve many problems and you have full agility when the RDBMS can create them online, and when there's zero code to change.
  • You don't need to test on billions of rows. The execution plan operations tell you what scales and what doesn't. You don't need to tell the optimizer how to proceed: it tells you how it plans and how it did.


Terraform Import from AWS

Mon, 2020-07-13 02:35

I came across an issue at a customer using AWS and Terraform. They created a server with an EBS volume attached to it, but the volume was deleted by mistake.

It was then recreated manually, without using Terraform. Thus, the tfstate wasn't up to date with the new volume, and when executing terraform plan it wanted to create a new volume with the same name and everything.
So in this blog you'll find the different steps to re-import the existing volume into the existing Terraform tfstate.

We will use Terraform directly on the command line. I assume that the path to your tfstate is correctly set in your .tf files; whether you have it locally or in S3 doesn't matter.
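
For reference, such a remote state configuration could look like the following sketch (placeholder names, assuming an S3 backend):

terraform {
  backend "s3" {
    bucket = "my-tfstate-bucket"
    key    = "infra/terraform.tfstate"
    region = "eu-central-1"
  }
}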

For this blog I’ll use the following resource declarations:
– EBS Volume Resource: aws_ebs_volume.inst-name-volume
– EBS Volume ID: vol-id_of_the_volume
– EBS Attachement Resource: aws_volume_attachment.inst-name-attachement
– Ec2 Instance ID: i-id_of_the_aws_instance
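
The corresponding resource declarations could look roughly like this (a sketch with placeholder values, not the customer's actual code):

resource "aws_ebs_volume" "inst-name-volume" {
  availability_zone = "eu-central-1a"
  size              = 50
}

resource "aws_volume_attachment" "inst-name-attachement" {
  device_name = "/dev/xvdb"
  volume_id   = aws_ebs_volume.inst-name-volume.id
  instance_id = aws_instance.inst-name.id
}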

If you execute the plan you’ll find which resource it will try to create:

Command

terraform plan -out myplan

If the resource names are the same between the ones already in your tfstate and the ones you want to import, you'll have to remove them from the tfstate first, as the import cannot update the current configuration.

Here is the command to list all your declared resources:

Command

terraform state list

Here is the command to remove the desired resources (EBS volume + attachment):

Command

terraform state rm aws_ebs_volume.inst-name-volume
terraform state rm aws_volume_attachment.inst-name-attachement

You can now import the existing EBS volume and attachment:

Command

terraform import aws_ebs_volume.inst-name-volume vol-id_of_the_volume
terraform import aws_volume_attachment.inst-name-attachement device_name:vol-id_of_the_volume:i-id_of_the_aws_instance
(e.g. terraform import aws_volume_attachment.inst-name-attachement /dev/xvdb:vol-028f89784ffff6294:i-0431fffffff1d2d3d)

The commands should complete successfully; if so, you can list the declared resources again:

Command

terraform state list

And if you execute the plan again, you should see no changes, everything being up to date:

Command

terraform plan -out myplan

Note that you can do it for an ec2 instance as well simply with:

Command

terraform import aws_instance.inst-name i-id_of_the_aws_instance

I hope this helped you; if you have any questions, do not hesitate to ask in the comments below.


The log_duration parameter in PostgreSQL

Fri, 2020-07-10 08:50

log_duration is a useful parameter for finding slow running queries and for spotting performance issues on the application side when PostgreSQL is used as the database. Another use case is finding issues with Java applications using Hibernate after a migration to PostgreSQL: often Hibernate switches from lazy to eager mode, and this has a massive impact on application performance.

In this context I will also describe one of the most useful extensions: pg_stat_statements.

The duration shown in the PostgreSQL logs includes the client time, so slow queries can also be caused by network problems.

For demonstration I will start by creating my own database:

postgres=# create database blog_log_duration;

After creating the database we leave psql with "\q"; later we will connect to the newly created database with psql blog_log_duration. The parameter log_duration switches on logging of the duration of every statement, while log_min_duration_statement logs a statement only when it exceeds the specified runtime.

Setting both parameters (log_duration to on and log_min_duration_statement to a value) would just fill up the logfile, which makes no sense, so we use log_min_duration_statement only.

postgres=# alter system set log_min_duration_statement = 10000;
ALTER SYSTEM

This logs all queries running longer than 10 seconds; the parameter value is in milliseconds.

postgres=# select pg_reload_conf();
pg_reload_conf
----------------
t
(1 row)

postgres=#

The new setting becomes active as a global setting for all databases of this instance. We quit the postgres connection with "\q" and connect to our newly created database with:

psql blog_log_duration;

postgres=# select pg_sleep(11);

In the log file we will see the following:

LOG: duration: 11032.666 ms statement: select pg_sleep(11);

Another possibility is to change these settings for one specific database only, for this we need to revoke the global setting with:

postgres=# alter system reset log_min_duration_statement;
ALTER SYSTEM

And reloading the configuration with:

postgres=# select pg_reload_conf();
pg_reload_conf
----------------
t
(1 row)

Checking with:

postgres=# select pg_sleep(11);

… shows no new entry in the log file.

Now we can change this parameter for our test database we created at the beginning:

postgres=# alter database blog_log_duration set log_min_duration_statement = 5000;

Checking for the postgres database:

postgres=# select pg_sleep(11);

… shows again no new entry in the log file.

Settings at the database level overrule global settings, just as a note.

Disconnecting from the postgres database with:

postgres=# \q

Connecting to the blog_log_duration database with:

psql blog_log_duration

And checking here with:

blog_log_duration=# select pg_sleep(6);
pg_sleep
----------

(1 row)

blog_log_duration=#

This results in the following line in the log file:

LOG: duration: 6005.967 ms statement: select pg_sleep(6);
Extension pg_stat_statements

This extension is one of the most useful ones: it shows which queries have been sent to the database, how often, and their average, min and max runtime.

To use pg_stat_statements we need to preload its library; the required configuration can be done with:

postgres=# alter system set shared_preload_libraries = 'pg_stat_statements';

Then a restart is required; pg_reload_conf() is not enough.

Then we connect again to our test database with psql blog_log_duration and create the extension with:

blog_log_duration=# create extension pg_stat_statements;
CREATE EXTENSION
blog_log_duration=#

blog_log_duration=# \d pg_stat_statements
                 View "public.pg_stat_statements"
       Column        |       Type       | Collation | Nullable | Default
---------------------+------------------+-----------+----------+---------
 userid              | oid              |           |          |
 dbid                | oid              |           |          |
 queryid             | bigint           |           |          |
 query               | text             |           |          |
 calls               | bigint           |           |          |
 total_time          | double precision |           |          |
 min_time            | double precision |           |          |
 max_time            | double precision |           |          |
 mean_time           | double precision |           |          |
 stddev_time         | double precision |           |          |
 rows                | bigint           |           |          |
 shared_blks_hit     | bigint           |           |          |
 shared_blks_read    | bigint           |           |          |
 shared_blks_dirtied | bigint           |           |          |
 shared_blks_written | bigint           |           |          |
 local_blks_hit      | bigint           |           |          |
 local_blks_read     | bigint           |           |          |
 local_blks_dirtied  | bigint           |           |          |
 local_blks_written  | bigint           |           |          |
 temp_blks_read      | bigint           |           |          |
 temp_blks_written   | bigint           |           |          |
 blk_read_time       | double precision |           |          |
 blk_write_time      | double precision |           |          |

blog_log_duration=#

This gives an overview of the view from which we can now get the query information we want. It can become very large, so there are some parameters that can be set to make it more manageable.

  • pg_stat_statements.max sets the maximum number of statements kept in the statistics; 1000, for example, means the last 1000 distinct queries are kept, and 5000 is the default.
  • pg_stat_statements.track specifies which statements are tracked.
  • pg_stat_statements.track = all tracks nested statements (including statements invoked within functions), none disables statement statistics collection and top tracks only top-level statements (those issued directly by clients); the default value is top.
  • pg_stat_statements.save specifies whether the statistics are saved across shutdowns or restarts, the default is on.
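
As a reference, these parameters can be set in postgresql.conf (or with ALTER SYSTEM); a sketch with example values, not necessarily the ones you want:

shared_preload_libraries = 'pg_stat_statements'
pg_stat_statements.max = 10000
pg_stat_statements.track = all
pg_stat_statements.save = on
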
blog_log_duration=# select pg_stat_statements_reset();
pg_stat_statements_reset
--------------------------

(1 row)

blog_log_duration=#

…resets the statistics.

Several runs of select pg_sleep();
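
The accumulated statistics can then be inspected, for example with a query like this (a minimal sketch; the columns match the pre-PostgreSQL-13 view definition shown above):

select calls,
       round(total_time::numeric, 2) as total_ms,
       round(mean_time::numeric, 2) as mean_ms,
       query
from pg_stat_statements
order by total_time desc
limit 10;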


Both pieces of information are very useful: the runtime shows which queries are slow, while the number of calls shows how often they are executed, and together they define where to look first.


Documentum – DARs installation fails with Java 11

Wed, 2020-07-08 14:05

As you probably heard, unless you were in an underground bunker for the past 2 years, Oracle changed the licensing of the Oracle JDK. If you want to read more about it, there is a good FAQ here. This obviously had some important repercussions everywhere and Documentum wasn't exempted. I would have a lot to say about the way OpenText tried to adapt since Documentum 16.4 (and kind of failed) but that's not really the topic of this blog. Because of this change on the Oracle side, many applications started to support OpenJDK and moved from Java 8 to Java 11 or above. In this blog, I wanted to talk about the issue that you will face if you ever try to install DARs using Java 11.

 

The support for Java 11 was introduced with Documentum 16.4, with either the Oracle JDK or OpenJDK. However, by default, the CS 16.4 still included an Oracle JDK 1.8u152 ($DOCUMENTUM/java64/JAVA_LINK) and if you were to try to use Java 11 with the CS 16.4, you would face & see some "funny" things. In a previous blog, I showed how it is possible to deploy DARs using a simple script and I have been using this for years without problem, or rather that was the case until the use of Java 11. In this blog, I will use a Documentum 20.2 environment installed and configured to use an external OpenJDK 11.0.7 (not pre-packaged with the binaries).

 

Running the usual script to install some DARs with Java 11 ends up with the following:

[dmadmin@cs-0 scripts]$ java -version
openjdk version "11.0.7" 2020-04-14
OpenJDK Runtime Environment 18.9 (build 11.0.7+10)
OpenJDK 64-Bit Server VM 18.9 (build 11.0.7+10, mixed mode)
[dmadmin@cs-0 scripts]$
[dmadmin@cs-0 scripts]$ cat deploy-all.sh
#!/bin/bash
docbases="GR_REPO1"
dar_list="D2-DAR.dar"
username="dmadmin"
dar_location="$DM_HOME/install/DARsInternal"
password="xxx"

old_ifs=${IFS}
for docbase in ${docbases}; do
  IFS=','; for dar in ${dar_list}; do
    dar_name=${dar##*/}

    echo "Deploying ${dar_name} into ${docbase}"
    ts=$(date "+%Y%m%d-%H%M%S")

    $JAVA_HOME/bin/java -Ddar="${dar_location}/${dar}" \
        -Dlogpath="${dar_location}/dar-deploy-${dar_name}-${docbase}-${ts}.log" \
        -Ddocbase=${docbase} -Duser=${username} -Ddomain= -Dpassword="${password}" \
        -cp $DM_HOME/install/composer/ComposerHeadless/startup.jar \
        org.eclipse.core.launcher.Main \
        -data $DM_HOME/install/composer/workspace \
        -application org.eclipse.ant.core.antRunner \
        -buildfile $DM_HOME/install/composer/deploy.xml

    if [[ ${?} != 0 ]]; then
      echo "ERROR - There was an error while installing the DAR '${dar_name}' into '${docbase}'"
      exit 1
    fi
  done
done
IFS=${old_ifs}
[dmadmin@cs-0 scripts]$
[dmadmin@cs-0 scripts]$ . ./deploy-all.sh
Deploying D2-DAR.dar into GR_REPO1
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by org.eclipse.osgi.internal.baseadaptor.BaseStorage (file:$DM_HOME/install/composer/ComposerHeadless/plugins/org.eclipse.osgi_3.8.1.v20120830-144521.jar) to method java.net.URLClassLoader.addURL(java.net.URL)
WARNING: Please consider reporting this to the maintainers of org.eclipse.osgi.internal.baseadaptor.BaseStorage
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
An error has occurred. See the log file
$DM_HOME/install/composer/ComposerHeadless/configuration/1593386605194.log.
ERROR - There was an error while installing the DAR 'D2-DAR.dar' into 'GR_REPO1'
[dmadmin@cs-0 scripts]$

 

Ignore the "WARNING" messages above; this is something that was added in Java 9 and if you want to read more about it, you can check this post. As you can see, the script fails and gives you back a nice log file with 5 000+ lines full of strange messages and errors like:

!ENTRY org.eclipse.equinox.simpleconfigurator 4 0 2020-06-28 23:23:25.599
!MESSAGE FrameworkEvent ERROR
!STACK 0
org.osgi.framework.BundleException: The bundle "org.eclipse.equinox.simpleconfigurator_1.0.301.v20120828-033635 [1]" could not be resolved. Reason: Missing Constraint: Bundle-RequiredExecutionEnvironment: CDC-1.1/Foundation-1.1,J2SE-1.4
        at org.eclipse.osgi.framework.internal.core.AbstractBundle.getResolverError(AbstractBundle.java:1332)
        at org.eclipse.osgi.framework.internal.core.AbstractBundle.getResolutionFailureException(AbstractBundle.java:1316)
        at org.eclipse.osgi.framework.internal.core.BundleHost.startWorker(BundleHost.java:323)
        at org.eclipse.osgi.framework.internal.core.AbstractBundle.resume(AbstractBundle.java:390)
        at org.eclipse.osgi.framework.internal.core.Framework.resumeBundle(Framework.java:1176)
        at org.eclipse.osgi.framework.internal.core.StartLevelManager.resumeBundles(StartLevelManager.java:559)
        at org.eclipse.osgi.framework.internal.core.StartLevelManager.resumeBundles(StartLevelManager.java:544)
        at org.eclipse.osgi.framework.internal.core.StartLevelManager.incFWSL(StartLevelManager.java:457)
        at org.eclipse.osgi.framework.internal.core.StartLevelManager.doSetStartLevel(StartLevelManager.java:243)
        at org.eclipse.osgi.framework.internal.core.StartLevelManager.dispatchEvent(StartLevelManager.java:438)
        at org.eclipse.osgi.framework.internal.core.StartLevelManager.dispatchEvent(StartLevelManager.java:1)
        at org.eclipse.osgi.framework.eventmgr.EventManager.dispatchEvent(EventManager.java:230)
        at org.eclipse.osgi.framework.eventmgr.EventManager$EventThread.run(EventManager.java:340)

!ENTRY org.eclipse.equinox.common 4 0 2020-06-28 23:23:25.602
!MESSAGE FrameworkEvent ERROR
!STACK 0
org.osgi.framework.BundleException: The bundle "org.eclipse.equinox.common_3.6.100.v20120522-1841 [59]" could not be resolved. Reason: Missing Constraint: Bundle-RequiredExecutionEnvironment: CDC-1.1/Foundation-1.1,J2SE-1.4
        at org.eclipse.osgi.framework.internal.core.AbstractBundle.getResolverError(AbstractBundle.java:1332)
        at org.eclipse.osgi.framework.internal.core.AbstractBundle.getResolutionFailureException(AbstractBundle.java:1316)
        at org.eclipse.osgi.framework.internal.core.BundleHost.startWorker(BundleHost.java:323)
        at org.eclipse.osgi.framework.internal.core.AbstractBundle.resume(AbstractBundle.java:390)
        at org.eclipse.osgi.framework.internal.core.Framework.resumeBundle(Framework.java:1176)
        at org.eclipse.osgi.framework.internal.core.StartLevelManager.resumeBundles(StartLevelManager.java:559)
        at org.eclipse.osgi.framework.internal.core.StartLevelManager.resumeBundles(StartLevelManager.java:544)
        at org.eclipse.osgi.framework.internal.core.StartLevelManager.incFWSL(StartLevelManager.java:457)
        at org.eclipse.osgi.framework.internal.core.StartLevelManager.doSetStartLevel(StartLevelManager.java:243)
        at org.eclipse.osgi.framework.internal.core.StartLevelManager.dispatchEvent(StartLevelManager.java:438)
        at org.eclipse.osgi.framework.internal.core.StartLevelManager.dispatchEvent(StartLevelManager.java:1)
        at org.eclipse.osgi.framework.eventmgr.EventManager.dispatchEvent(EventManager.java:230)
        at org.eclipse.osgi.framework.eventmgr.EventManager$EventThread.run(EventManager.java:340)

...

!ENTRY org.eclipse.osgi 4 0 2020-06-28 23:23:25.895
!MESSAGE Application error
!STACK 1
java.lang.IllegalStateException: Unable to acquire application service. Ensure that the org.eclipse.core.runtime bundle is resolved and started (see config.ini).
        at org.eclipse.core.runtime.internal.adaptor.EclipseAppLauncher.start(EclipseAppLauncher.java:74)
        at org.eclipse.core.runtime.adaptor.EclipseStarter.run(EclipseStarter.java:353)
        at org.eclipse.core.runtime.adaptor.EclipseStarter.run(EclipseStarter.java:180)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.base/java.lang.reflect.Method.invoke(Method.java:566)
        at org.eclipse.equinox.launcher.Main.invokeFramework(Main.java:629)
        at org.eclipse.equinox.launcher.Main.basicRun(Main.java:584)
        at org.eclipse.equinox.launcher.Main.run(Main.java:1438)
        at org.eclipse.equinox.launcher.Main.main(Main.java:1414)
        at org.eclipse.core.launcher.Main.main(Main.java:34)

 

What happens is that, to deploy DARs, the script uses the Composer Headless. It is a very old piece of software, present since (almost) the creation of the universe, and it doesn't support Java 11 at all. With Documentum 16.4, since an Oracle JDK 8u152 was shipped with the binaries, it wasn't an issue to deploy DARs. However, with newer versions of the Content Server like 20.2, you can technically install Documentum from scratch using an external Java 11. That being said, since the InstallAnywhere packages (as far as I know) and the Composer Headless do not support Java 11, OpenText included some old JDK 8 here and there to do the needed jobs.

 

In 20.2 for example, you will find an Oracle JDK 8u202 inside the Composer Headless for DAR installations. Documentum uses it whenever a DAR install is needed: Repository, D2, BPM/xCP, and so on. You can find the information about the Oracle JDK 8 in the dardeployer.ini configuration file:

[dmadmin@cs-0 scripts]$ cat $DM_HOME/install/composer/ComposerHeadless/dardeployer.ini
-vm
$DM_HOME/install/composer/ComposerHeadless/java/jdk1.8.0_202_x64/bin
-vm
./java/jdk1.8.0_202_x64/bin
-configuration
darinstallerconfiguration
--launcher.XXMaxPermSize
256m
-vmargs
-Xms40m
-Xmx512m
-XX:MaxPermSize=256m
[dmadmin@cs-0 scripts]$

 

What does that mean for the script to deploy DARs? Well, probably the simplest solution that keeps backward compatibility with older versions of Documentum is to use the Java shipped with the Composer Headless, if any, and otherwise fall back to the global JDK referenced in the JAVA_HOME environment variable. The problem is that OpenText keeps changing their packaging and it will probably continue to change in the future, so I believe it's important to stay as agile and generic as possible. It would be possible to fetch the Java path from the dardeployer.ini file, but I have the feeling that it is safer to just find the binary without relying on something put there by the OpenText installers. The future will tell if this was the right call. Anyway, here is an updated version of the script that automatically chooses the Java binary and runs the command (you can also add a check to make sure it is indeed a JDK 8 if you want):

[dmadmin@cs-0 scripts]$ cat deploy-all.sh
#!/bin/bash
docbases="GR_REPO1"
dar_list="D2-DAR.dar"
username="dmadmin"
dar_location="$DM_HOME/install/DARsInternal"
password="xxx"

java=`find $DM_HOME/install/composer/ComposerHeadless -type f -name java | head -1`
if [[ ! -x ${java} ]]; then
  java=$JAVA_HOME/bin/java
fi

old_ifs=${IFS}
for docbase in ${docbases}; do
  IFS=','; for dar in ${dar_list}; do
    dar_name=${dar##*/}

    echo "Deploying ${dar_name} into ${docbase}"
    ts=$(date "+%Y%m%d-%H%M%S")

    ${java} -Ddar="${dar_location}/${dar}" \
        -Dlogpath="${dar_location}/dar-deploy-${dar_name}-${docbase}-${ts}.log" \
        -Ddocbase=${docbase} -Duser=${username} -Ddomain= -Dpassword="${password}" \
        -cp $DM_HOME/install/composer/ComposerHeadless/startup.jar \
        org.eclipse.core.launcher.Main \
        -data $DM_HOME/install/composer/workspace \
        -application org.eclipse.ant.core.antRunner \
        -buildfile $DM_HOME/install/composer/deploy.xml

    if [[ ${?} != 0 ]]; then
      echo "ERROR - There was an error while installing the DAR '${dar_name}' into '${docbase}'"
      exit 1
    fi
  done
done
IFS=${old_ifs}
[dmadmin@cs-0 scripts]$
[dmadmin@cs-0 scripts]$ . ./deploy-all.sh
Deploying D2-DAR.dar into GR_REPO1
Buildfile: $DM_HOME/install/composer/deploy.xml

deploy:
[emc.installer] [INFO]  ******************************************************
[emc.installer] [INFO]  * Headless Composer
[emc.installer] [INFO]  * Version:        20.2.0000.0081
[emc.installer] [INFO]  * Java version:   1.8.0_202 (64bit)
[emc.installer] [INFO]  * Java home:      $DM_HOME/install/composer/ComposerHeadless/java/jdk1.8.0_202_x64/jre
[emc.installer] [INFO]  * Set storage type: false
[emc.installer] [INFO]  *
[emc.installer] [INFO]  * DAR file:       $DM_HOME/install/DARsInternal/D2-DAR.dar
[emc.installer] [INFO]  * Project name:   D2-DAR
[emc.installer] [INFO]  * Built by Composer: 7.1.0000.0186
[emc.installer] [INFO]  *
[emc.installer] [INFO]  * Repository:     GR_REPO1
[emc.installer] [INFO]  * Server version: 20.2.0000.0110  Linux64.Postgres
[emc.installer] [INFO]  * User name:      dmadmin
[emc.installer] [INFO]  ******************************************************
[emc.installer] [INFO]  Install started...  Mon Jun 29 00:18:38 UTC 2020
[emc.installer] [INFO]  Executing pre-install script
[emc.installer] [INFO]  dmbasic.exe output : Connected to the server as dmadmin.
...
...
[emc.installer] [INFO]  dmbasic.exe output : Disconnected from the server.
[emc.installer] [INFO]  Finished executing post-install script Mon Jun 29 00:20:40 UTC 2020
[emc.installer] [INFO]  Project 'D2-DAR' was successfully installed.
BUILD SUCCESSFUL

BUILD SUCCESSFUL
Total time: 2 minutes 15 seconds
[dmadmin@cs-0 scripts]$
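
As for the optional JDK 8 check mentioned above, here is a minimal sketch (my own addition, not part of the original script) that could be placed right after the java detection in deploy-all.sh:

if ! ${java} -version 2>&1 | grep -qE 'version "(1\.8\.|8\.)'; then
  echo "ERROR - '${java}' is not a Java 8 binary, the Composer Headless would fail to start"
  exit 1
fi

It simply parses the output of "java -version", so it works for both Oracle JDK and OpenJDK builds.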

 

With this new update of the script, it should work for all versions of Documentum. As mentioned previously, Documentum 20.2 includes the Oracle JDK 8u202 for the Composer Headless. If I’m not mistaken, this is the last free version of the Oracle JDK licensed under the Oracle BCL (Binary Code License). It means that without a Java OTN (Oracle Technology Network) License, you will NOT be able to patch it; it will need to stay at 8u202.

 

The article Documentum – DARs installation fails Java 11 appeared first on the dbi services blog.

The myth of NoSQL (vs. RDBMS) “joins dont scale”

Sun, 2020-07-05 14:45
By Franck Pachot

.
I’ll reference Alex DeBrie’s article “SQL, NoSQL, and Scale: How DynamoDB scales where relational databases don’t“, especially the paragraph about “Why relational databases don’t scale”. But I want to make clear that my post here is not against this article, but against a very common myth that even predates NoSQL databases. Actually, I’m taking this article as a reference because the author, on his website and in his book, has really good points about data modeling in NoSQL, and because AWS DynamoDB is probably the strongest NoSQL database today. I’m challenging some widespread assertions, so I’d better do it based on good quality content and a good product.

There are many use cases where NoSQL is a correct alternative, but moving from a relational database system to a key-value store because you heard that “joins don’t scale” is probably not a good reason. You should choose a solution for the features it offers (the problem it solves), and not because you are unaware of what your current platform is capable of.

time complexity of joins

The idea in this article, taken from the popular Rick Houlihan talk, is that, by joining tables, you read more data, that it is a CPU-intensive operation, and that the time complexity of joins is “( O (M + N) ) or worse” according to Alex DeBrie’s article, or “O(log(N))+Nlog(M)” according to Rick Houlihan’s slide. This refers to “time complexity” and “Big O” notation, which you can read about. But it supposes that the cost of a join depends on the size of the tables involved (represented by N and M). That would be right with a full scan of non-partitioned tables. But all relational databases come with B*Tree indexes. And when we compare to a key-value store we are obviously retrieving few rows, so the join method will be a Nested Loop with Index Range Scan. This access method does not depend at all on the size of the tables. That’s why relational databases are the kings of OLTP applications.

Let’s test it

The article claims that “there’s one big problem with relational databases: performance is like a black box. There are a ton of factors that impact how quickly your queries will return.”, with an example of a query like: “SELECT * FROM orders JOIN users ON … WHERE user.id = … GROUP BY … LIMIT 20”. It says that “As the size of your tables grow, these operations will get slower and slower.”

I will build those tables, in PostgreSQL here, because that’s my preferred Open Source RDBMS, and show that:

  • Performance is not a black box: all RDBMS have an EXPLAIN command that displays exactly the algorithm used (even CPU and memory access) and you can easily estimate the cost from it.
  • The Join and Group By here do not depend on the size of the tables at all but only on the rows selected. You can multiply your tables by several orders of magnitude before the number of rows impacts the response time by a tenth of a millisecond.
Create the tables

I will create small tables here, because when reading the execution plan I don’t need large tables to estimate the cost and the response-time scalability. You can run the same with larger tables if you want to see how it scales, and then maybe investigate further the features for big tables (partitioning, parallel query,…). But let’s start with very simple tables without any specific optimization.


\timing
create table USERS ( USER_ID bigserial primary key, FIRST_NAME text, LAST_NAME text)
;
CREATE TABLE
create table ORDERS ( ORDER_ID bigserial primary key, ORDER_DATE timestamp, AMOUNT numeric(10,2), USER_ID bigint, DESCRIPTION text)
;

I have created the two tables that I’ll join. Both with an auto-generated primary key for simplicity.

Insert test data

It is important to build the tables as they would be in real life. USERS probably come in no specific order. ORDERS come by date. This means that the ORDERS of one USER are scattered throughout the table. The query would be much faster with clustered data, and each RDBMS has ways to achieve this (see the sketch below), but I want to show the cost of joining two tables here without any specific optimization.
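
For reference, here is a minimal sketch of one such option in PostgreSQL (my own addition, deliberately not used in this test): once the ORDERS_BY_USER index exists (it is created a bit later in this post), the table can be physically reordered along it, from the shell:

# one-time physical reorder of the table along the index
# (takes an exclusive lock and is not maintained for future inserts)
psql -c "CLUSTER orders USING orders_by_user"
psql -c "VACUUM ANALYZE orders"

With the rows of one user stored together, the Bitmap Heap Scan shown later would touch far fewer table pages.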


insert into  USERS (FIRST_NAME, LAST_NAME)
 with
  random_words as (

  select generate_series id,
   translate(md5(random()::text),'-0123456789','aeioughij') as word

  from generate_series(1,100)

  )
 select
  words1.word ,words2.word

  from
   random_words words1
  cross join
   random_words words2
  order by words1.id+words2.id

;
INSERT 0 10000
Time: 28.641 ms

select * from USERS order by user_id fetch first 10 rows only;
 user_id |           first_name           |           last_name
---------+--------------------------------+--------------------------------
       1 | iooifgiicuiejiaeduciuccuiogib  | iooifgiicuiejiaeduciuccuiogib
       2 | iooifgiicuiejiaeduciuccuiogib  | dgdfeeiejcohfhjgcoigiedeaubjbg
       3 | dgdfeeiejcohfhjgcoigiedeaubjbg | iooifgiicuiejiaeduciuccuiogib
       4 | ueuedijchudifefoedbuojuoaudec  | iooifgiicuiejiaeduciuccuiogib
       5 | iooifgiicuiejiaeduciuccuiogib  | ueuedijchudifefoedbuojuoaudec
       6 | dgdfeeiejcohfhjgcoigiedeaubjbg | dgdfeeiejcohfhjgcoigiedeaubjbg
       7 | iooifgiicuiejiaeduciuccuiogib  | jbjubjcidcgugubecfeejidhoigdob
       8 | jbjubjcidcgugubecfeejidhoigdob | iooifgiicuiejiaeduciuccuiogib
       9 | ueuedijchudifefoedbuojuoaudec  | dgdfeeiejcohfhjgcoigiedeaubjbg
      10 | dgdfeeiejcohfhjgcoigiedeaubjbg | ueuedijchudifefoedbuojuoaudec
(10 rows)

Time: 0.384 ms

This generated 10000 users in the USERS table with random names.


insert into ORDERS ( ORDER_DATE, AMOUNT, USER_ID, DESCRIPTION)
 with
  random_amounts as (
      ------------> this generates 10 random order amounts
  select
   1e6*random() as AMOUNT
  from generate_series(1,10)
 )
 ,random_dates as (
      ------------> this generates 10 random order dates
  select
   now() - random() * interval '1 year' as ORDER_DATE
  from generate_series(1,10)
 )
select ORDER_DATE, AMOUNT, USER_ID, md5(random()::text) DESCRIPTION from
  random_dates
 cross join
  random_amounts
 cross join
  users
 order by ORDER_DATE
      ------------> I sort this by date because that's what happens in real life. Clustering by users would not be a fair test.
;
INSERT 0 1000000
Time: 4585.767 ms (00:04.586)

I now have one million orders, generated as 100 orders for each user over the last year. You can play with the numbers to generate more and see how it scales. This is fast: 5 seconds to generate 1 million orders here, and there are many ways to load faster if you feel the need to test on very large data sets. But that is not the point.


create index ORDERS_BY_USER on ORDERS(USER_ID);
CREATE INDEX
Time: 316.322 ms
alter table ORDERS add constraint ORDERS_BY_USER foreign key (USER_ID) references USERS;
ALTER TABLE
Time: 194.505 ms

As, in my data model, I want to navigate from USERS to ORDERS, I add an index for it. And I declare the referential integrity to avoid logical corruption in case there is a bug in my application, or a human mistake during some ad-hoc fixes. As soon as the constraint is declared, I no longer need to check this assertion in my code. The constraint also lets the query planner know about this one-to-many relationship, as it may open some optimizations. That’s the beauty of relational databases: we declare those things rather than having to code their implementation.


vacuum analyze;
VACUUM
Time: 429.165 ms

select version();
                                                 version
---------------------------------------------------------------------------------------------------------
 PostgreSQL 12.3 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-39), 64-bit
(1 row)

I’m on PostgreSQL version 12, on a Linux box with only 2 CPUs. I’ve run a manual VACUUM to get a reproducible test case rather than relying on autovacuum kicking in after those table loads.


select relkind,relname,reltuples,relpages,lpad(pg_size_pretty(relpages::bigint*8*1024),20) "relpages * 8k" from pg_class natural join (select oid relnamespace,nspname from pg_namespace) nsp where nspname='public' order by relpages;

 relkind |       relname       | reltuples | relpages |    relpages * 8k
---------+---------------------+-----------+----------+----------------------
 S       | orders_order_id_seq |         1 |        1 |           8192 bytes
 S       | users_user_id_seq   |         1 |        1 |           8192 bytes
 i       | users_pkey          |     10000 |       30 |               240 kB
 r       | users               |     10000 |      122 |               976 kB
 i       | orders_pkey         |     1e+06 |     2745 |                21 MB
 i       | orders_by_user      |     1e+06 |     2749 |                21 MB
 r       | orders              |     1e+06 |    13334 |               104 MB

Here are the sizes of my tables: the 1 million orders take 104 MB and the 10 thousand users take about 1 MB. I’ll increase them later, but I don’t need that to understand the performance predictability.

When looking at those sizes, remember that the size of a table is only the size of the data. Metadata (like column names) is not repeated for each row: it is stored once in the RDBMS catalog describing the table. As PostgreSQL has a native JSON datatype, you can also test with a non-relational model here, as sketched below.
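
Here is what a quick document-style variant of the same data could look like (a sketch of my own, not part of the original test), run from the shell with psql:

psql -c "create table orders_doc (order_id bigserial primary key, doc jsonb)"
psql -c "insert into orders_doc(doc)
         select jsonb_build_object('user_id', user_id, 'order_date', order_date,
                                   'amount', amount, 'description', description)
         from orders"
# expression index to keep the lookup by user as fast as in the relational model
psql -c "create index orders_doc_by_user on orders_doc (((doc->>'user_id')::bigint))"

As each document repeats its attribute names, expect this table to be noticeably larger than the relational one.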

Execute the query

select order_month,sum(amount),count(*)
from
  (
  select date_trunc('month',ORDER_DATE) order_month, USER_ID, AMOUNT
  from ORDERS join USERS using(USER_ID)
  ) user_orders
where USER_ID=42
group by order_month
;

     order_month     |     sum     | count
---------------------+-------------+-------
 2019-11-01 00:00:00 |  5013943.57 |    10
 2019-09-01 00:00:00 |  5013943.57 |    10
 2020-04-01 00:00:00 |  5013943.57 |    10
 2019-08-01 00:00:00 | 15041830.71 |    30
 2020-02-01 00:00:00 | 10027887.14 |    20
 2020-06-01 00:00:00 |  5013943.57 |    10
 2019-07-01 00:00:00 |  5013943.57 |    10
(7 rows)

Time: 2.863 ms

Here is the query mentioned in the article: get all orders for one user, and do some aggregation on them. Looking at the elapsed time alone is not really interesting as it depends on the RAM available to cache the buffers at database or filesystem level, and on disk latency. But we are developers and, by looking at the operations and loops that were executed, we can understand how it scales and where we are in terms of “time complexity” and “Big O” order of magnitude.

Explain the execution

This is as easy as running the query with EXPLAIN:


explain (analyze,verbose,costs,buffers)
select order_month,sum(amount)
from
  (
  select date_trunc('month',ORDER_DATE) order_month, USER_ID, AMOUNT
  from ORDERS join USERS using(USER_ID)
  ) user_orders
where USER_ID=42
group by order_month
;
                                                                   QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------------------------
 GroupAggregate  (cost=389.34..390.24 rows=10 width=40) (actual time=0.112..0.136 rows=7 loops=1)
   Output: (date_trunc('month'::text, orders.order_date)), sum(orders.amount)
   Group Key: (date_trunc('month'::text, orders.order_date))
   Buffers: shared hit=19
   ->  Sort  (cost=389.34..389.59 rows=100 width=17) (actual time=0.104..0.110 rows=100 loops=1)
         Output: (date_trunc('month'::text, orders.order_date)), orders.amount
         Sort Key: (date_trunc('month'::text, orders.order_date))
         Sort Method: quicksort  Memory: 32kB
         Buffers: shared hit=19
         ->  Nested Loop  (cost=5.49..386.02 rows=100 width=17) (actual time=0.022..0.086 rows=100 loops=1)
               Output: date_trunc('month'::text, orders.order_date), orders.amount
               Buffers: shared hit=19
               ->  Index Only Scan using users_pkey on public.users  (cost=0.29..4.30 rows=1 width=8) (actual time=0.006..0.006 rows=1 loops=1)
                     Output: users.user_id
                     Index Cond: (users.user_id = 42)
                     Heap Fetches: 0
                     Buffers: shared hit=3
               ->  Bitmap Heap Scan on public.orders  (cost=5.20..380.47 rows=100 width=25) (actual time=0.012..0.031 rows=100 loops=1)
                     Output: orders.order_id, orders.order_date, orders.amount, orders.user_id, orders.description
                     Recheck Cond: (orders.user_id = 42)
                     Heap Blocks: exact=13
                     Buffers: shared hit=16
                     ->  Bitmap Index Scan on orders_by_user  (cost=0.00..5.17 rows=100 width=0) (actual time=0.008..0.008 rows=100 loops=1)
                           Index Cond: (orders.user_id = 42)
                           Buffers: shared hit=3
 Planning Time: 0.082 ms
 Execution Time: 0.161 ms

Here is the execution plan with the execution statistics. We have read 3 pages from USERS to get our USER_ID (Index Only Scan) and navigated, with Nested Loop, to ORDERS and get the 100 orders for this user. This has read 3 pages from the index (Bitmap Index Scan) and 13 pages from the table. That’s a total of 19 pages. Those pages are 8k blocks.

4x initial size scales with same performance: 19 page reads

Let’s increase the size of the tables by a factor of 4. I change the “generate_series(1,100)” to “generate_series(1,200)” in the USERS generation and run the same. That generates 40000 users and 4 million orders.


select relkind,relname,reltuples,relpages,lpad(pg_size_pretty(relpages::bigint*8*1024),20) "relpages * 8k" from pg_class natural join (select
oid relnamespace,nspname from pg_namespace) nsp where nspname='public' order by relpages;

 relkind |       relname       |  reltuples   | relpages |    relpages * 8k
---------+---------------------+--------------+----------+----------------------
 S       | orders_order_id_seq |            1 |        1 |           8192 bytes
 S       | users_user_id_seq   |            1 |        1 |           8192 bytes
 i       | users_pkey          |        40000 |      112 |               896 kB
 r       | users               |        40000 |      485 |              3880 kB
 i       | orders_pkey         |        4e+06 |    10969 |                86 MB
 i       | orders_by_user      |        4e+06 |    10985 |                86 MB
 r       | orders              | 3.999995e+06 |    52617 |               411 MB

explain (analyze,verbose,costs,buffers)
select order_month,sum(amount)
from
  (
  select date_trunc('month',ORDER_DATE) order_month, USER_ID, AMOUNT
  from ORDERS join USERS using(USER_ID)
  ) user_orders
where USER_ID=42
group by order_month
;
                                                                   QUERY PLAN                                                                 
------------------------------------------------------------------------------------------------------------------------------------------------
 GroupAggregate  (cost=417.76..418.69 rows=10 width=40) (actual time=0.116..0.143 rows=9 loops=1)
   Output: (date_trunc('month'::text, orders.order_date)), sum(orders.amount)
   Group Key: (date_trunc('month'::text, orders.order_date))
   Buffers: shared hit=19
   ->  Sort  (cost=417.76..418.02 rows=104 width=16) (actual time=0.108..0.115 rows=100 loops=1)
         Output: (date_trunc('month'::text, orders.order_date)), orders.amount
         Sort Key: (date_trunc('month'::text, orders.order_date))
         Sort Method: quicksort  Memory: 32kB
         Buffers: shared hit=19
         ->  Nested Loop  (cost=5.53..414.27 rows=104 width=16) (actual time=0.044..0.091 rows=100 loops=1)
               Output: date_trunc('month'::text, orders.order_date), orders.amount
               Buffers: shared hit=19
               ->  Index Only Scan using users_pkey on public.users  (cost=0.29..4.31 rows=1 width=8) (actual time=0.006..0.007 rows=1 loops=1)
                     Output: users.user_id
                     Index Cond: (users.user_id = 42)
                     Heap Fetches: 0
                     Buffers: shared hit=3
               ->  Bitmap Heap Scan on public.orders  (cost=5.24..408.66 rows=104 width=24) (actual time=0.033..0.056 rows=100 loops=1)
                     Output: orders.order_id, orders.order_date, orders.amount, orders.user_id, orders.description
                     Recheck Cond: (orders.user_id = 42)
                     Heap Blocks: exact=13
                     Buffers: shared hit=16
                     ->  Bitmap Index Scan on orders_by_user  (cost=0.00..5.21 rows=104 width=0) (actual time=0.029..0.029 rows=100 loops=1)
                           Index Cond: (orders.user_id = 42)
                           Buffers: shared hit=3
 Planning Time: 0.084 ms
 Execution Time: 0.168 ms
(27 rows)

Time: 0.532 ms

Look how we scaled here: I multiplied the size of the tables by 4x and I have exactly the same cost: 3 pages from each index and 13 from the table. This is the beauty of B*Tree: the cost depends only on the height of the tree and not the size of the table. And you can increase the table exponentially before the height of the index increases.

16x initial size scales with 21/19=1.1 cost factor

Let’s go further and multiply the size of the tables by 4 again. I change to “generate_series(1,400)” in the USERS generation and run the same. That generates 160 thousand users and 16 million orders.


select relkind,relname,reltuples,relpages,lpad(pg_size_pretty(relpages::bigint*8*1024),20) "relpages * 8k" from pg_class natural join (select
oid relnamespace,nspname from pg_namespace) nsp where nspname='public' order by relpages;
 relkind |       relname       |  reltuples   | relpages |    relpages * 8k
---------+---------------------+--------------+----------+----------------------
 S       | orders_order_id_seq |            1 |        1 |           8192 bytes
 S       | users_user_id_seq   |            1 |        1 |           8192 bytes
 i       | users_pkey          |       160000 |      442 |              3536 kB
 r       | users               |       160000 |     1925 |                15 MB
 i       | orders_pkey         |      1.6e+07 |    43871 |               343 MB
 i       | orders_by_user      |      1.6e+07 |    43935 |               343 MB
 r       | orders              | 1.600005e+07 |   213334 |              1667 MB

explain (analyze,verbose,costs,buffers)
select order_month,sum(amount)
from
  (
  select date_trunc('month',ORDER_DATE) order_month, USER_ID, AMOUNT
  from ORDERS join USERS using(USER_ID)
  ) user_orders
where USER_ID=42
group by order_month
;
                                                                   QUERY PLAN                                                                 
------------------------------------------------------------------------------------------------------------------------------------------------
 GroupAggregate  (cost=567.81..569.01 rows=10 width=40) (actual time=0.095..0.120 rows=8 loops=1)
   Output: (date_trunc('month'::text, orders.order_date)), sum(orders.amount)
   Group Key: (date_trunc('month'::text, orders.order_date))
   Buffers: shared hit=21
   ->  Sort  (cost=567.81..568.16 rows=140 width=17) (actual time=0.087..0.093 rows=100 loops=1)
         Output: (date_trunc('month'::text, orders.order_date)), orders.amount
         Sort Key: (date_trunc('month'::text, orders.order_date))
         Sort Method: quicksort  Memory: 32kB
         Buffers: shared hit=21
         ->  Nested Loop  (cost=6.06..562.82 rows=140 width=17) (actual time=0.026..0.070 rows=100 loops=1)
               Output: date_trunc('month'::text, orders.order_date), orders.amount
               Buffers: shared hit=21
               ->  Index Only Scan using users_pkey on public.users  (cost=0.42..4.44 rows=1 width=8) (actual time=0.007..0.007 rows=1 loops=1)
                     Output: users.user_id
                     Index Cond: (users.user_id = 42)
                     Heap Fetches: 0
                     Buffers: shared hit=4
               ->  Bitmap Heap Scan on public.orders  (cost=5.64..556.64 rows=140 width=25) (actual time=0.014..0.035 rows=100 loops=1)
                     Output: orders.order_id, orders.order_date, orders.amount, orders.user_id, orders.description
                     Recheck Cond: (orders.user_id = 42)
                     Heap Blocks: exact=13
                     Buffers: shared hit=17
                     ->  Bitmap Index Scan on orders_by_user  (cost=0.00..5.61 rows=140 width=0) (actual time=0.010..0.010 rows=100 loops=1)
                           Index Cond: (orders.user_id = 42)
                           Buffers: shared hit=4
 Planning Time: 0.082 ms
 Execution Time: 0.159 ms

Yes, when increasing the table size a lot, the index may have to split its root block to add a new branch level. The consequence is minimal: two additional pages to read here. The additional execution time is in tens of microseconds when cached in RAM, up to milliseconds if read from a mechanical disk. With 10x or 100x larger tables, you will do some physical I/O and only the top index branches will be in memory. You can then expect about 20 I/O calls. With SSD on Fibre Channel (like with AWS RDS PostgreSQL for example), this will still be in single-digit milliseconds. The cheapest AWS EBS Provisioned IOPS for RDS is 1000 IOPS, so you can estimate the number of queries per second you can run; see the back-of-the-envelope calculation below. The number of pages here (“Buffers”) is the right metric to estimate the cost of the query, to be compared with DynamoDB RCUs.
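
A back-of-the-envelope estimate with my own numbers (worst case: nothing is cached and each of the ~21 buffers is a random read from the provisioned volume):

# ~21 random page reads per query, 1000 provisioned IOPS
echo $(( 1000 / 21 ))    # => 47, i.e. roughly 47 such queries per second entirely from disk

Any realistic cache hit ratio on the index branches pushes this number much higher.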

64x initial size scales with 23/19=1.2 cost factor

Last test for me, but you can go further: I change to “generate_series(1,800)” in the USERS generation and run the same. That generates 640 thousand users and 64 million orders.



select relkind,relname,reltuples,relpages,lpad(pg_size_pretty(relpages::bigint*8*1024),20) "relpages * 8k" from pg_class natural join (select
oid relnamespace,nspname from pg_namespace) nsp where nspname='public' order by relpages;

 relkind |       relname       |   reltuples   | relpages |    relpages * 8k
---------+---------------------+---------------+----------+----------------------
 S       | orders_order_id_seq |             1 |        1 |           8192 bytes
 S       | users_user_id_seq   |             1 |        1 |           8192 bytes
 i       | users_pkey          |        640000 |     1757 |                14 MB
 r       | users               |        640000 |     7739 |                60 MB
 i       | orders_pkey         |       6.4e+07 |   175482 |              1371 MB
 i       | orders_by_user      |       6.4e+07 |   175728 |              1373 MB
 r       | orders              | 6.4000048e+07 |   853334 |              6667 MB


explain (analyze,verbose,costs,buffers)
select order_month,sum(amount)
from
  (
  select date_trunc('month',ORDER_DATE) order_month, USER_ID, AMOUNT
  from ORDERS join USERS using(USER_ID)
  ) user_orders
where USER_ID=42
group by order_month
;
                                                                QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------
 HashAggregate  (cost=1227.18..1227.33 rows=10 width=40) (actual time=0.102..0.105 rows=7 loops=1)
   Output: (date_trunc('month'::text, orders.order_date)), sum(orders.amount)
   Group Key: date_trunc('month'::text, orders.order_date)
   Buffers: shared hit=23
   ->  Nested Loop  (cost=7.36..1225.65 rows=306 width=17) (actual time=0.027..0.072 rows=100 loops=1)
         Output: date_trunc('month'::text, orders.order_date), orders.amount
         Buffers: shared hit=23
         ->  Index Only Scan using users_pkey on public.users  (cost=0.42..4.44 rows=1 width=8) (actual time=0.007..0.008 rows=1 loops=1)
               Output: users.user_id
               Index Cond: (users.user_id = 42)
               Heap Fetches: 0
               Buffers: shared hit=4
         ->  Bitmap Heap Scan on public.orders  (cost=6.94..1217.38 rows=306 width=25) (actual time=0.015..0.037 rows=100 loops=1)
               Output: orders.order_id, orders.order_date, orders.amount, orders.user_id, orders.description
               Recheck Cond: (orders.user_id = 42)
               Heap Blocks: exact=15
               Buffers: shared hit=19
               ->  Bitmap Index Scan on orders_by_user  (cost=0.00..6.86 rows=306 width=0) (actual time=0.011..0.011 rows=100 loops=1)
                     Index Cond: (orders.user_id = 42)
                     Buffers: shared hit=4
 Planning Time: 0.086 ms
 Execution Time: 0.134 ms
(22 rows)

Time: 0.521 ms

You see how the cost increases slowly, with two additional pages to read here.

Please test with more rows. When looking at the time (less than 1 millisecond), keep in mind that it is CPU time here, as all pages are in the memory cache. This is on purpose, because the article mentions CPU and RAM (“joins require a lot of CPU and memory”). Anyway, 20 disk reads will still be within a millisecond on modern storage.

Time complexity of B*Tree is actually very light

Let’s get back to the “Big O” notation. We are definitely not in “O (M + N)”. That would be the case without any index, where all pages from all tables would be full scanned. Any database maintains hashed or sorted structures to improve high-selectivity queries. This has nothing to do with SQL vs. NoSQL or with RDBMS vs. hierarchical DB. In DynamoDB this structure on the primary key is HASH partitioning (with additional sorting). In an RDBMS the most common structure is a sorted index (with the possibility of additional partitioning, zone maps, clustering,…). We are not in “O(log(N))+Nlog(M)” but more like “O(log(N))+log(M)”, because the size of the driving table (N) has no influence on the inner loop cost: the first filtering, the one with “O(log(N))”, has already filtered out those N rows. Using “N” there, and ignoring the outer selectivity, is a mistake. Anyway, this O(log) for a B*Tree is really scalable, as we have seen, because the base of this log() function is really large (8KB branch blocks holding hundreds of keys), so the index height stays really small.
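
Putting rough numbers on it (my own simplified model, not from the article): for a query selecting the orders of one user, with b keys per 8KB index block,

\[
\text{cost} \;\approx\; \underbrace{\log_b N}_{\text{USERS pkey}} \;+\; \underbrace{\log_b M}_{\text{ORDERS index}} \;+\; \underbrace{r}_{\text{table pages holding the selected rows}}
\]

With b in the hundreds, both logarithms stay at 3 or 4 levels even for billions of rows, and r is the 13 to 15 heap blocks seen in the plans above, so N and M barely matter.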

Anyway, the point is not there. As you have seen, the cost of Nested Loop is in the blocks that are fetched from the inner loop, and this depends on the selectivity and the clustering factor (the table physical order correlation with the index order), and some prefetching techniques. This O(log) for going through the index is often insignificant on Range Scans.

40 years of RDBMS optimizations

If you have a database size that is several order of magnitude from those million rows, and where this millisecond latency is a problem, the major RDBMS offer many features and structures to go further in performance and predictability of response time. As we have seen, the cost of this query is not about Index, Nested Loop or Group By. It is about the scattered items: ingested by date but queried by user.

Through its 40 years of optimizing OLTP applications, the Oracle Database has accumulated a lot of features, with their popularity depending on the IT trends. From the first versions, in the 80’s, because hierarchical databases were still in people’s minds and the fear that “joins are expensive” was already there, Oracle implemented a structure to store tables together, pre-joined. This is called a CLUSTER. It was, from the beginning, indexed with a sorted structure (INDEX CLUSTER) or a hashed structure (HASH CLUSTER). Parallel Query was quickly implemented to scale up and, with Parallel Server, now known as RAC, to scale out. Partitioning was introduced to scale further, again with sorted (PARTITION BY RANGE) or hashed (PARTITION BY HASH) segments. This can further scale index access (LOCAL INDEX with partition pruning). And it also scales joins (PARTITION-WISE JOIN) to reduce the CPU and RAM needed to join large datasets. When more agility was needed, B*Tree sorted structures were the winners and Index Organized Tables were introduced to cluster data (avoiding the most expensive operation, as we have seen in the examples above). But Heap Tables and B*Tree indexes are still the winners given their agility, and because the Join operations were rarely a bottleneck. Currently, partitioning keeps improving (even scaling horizontally beyond the cluster with sharding), as well as data clustering (Attribute Clustering, Zone Maps and Storage Indexes). And there’s always the possibility of Index Only access, just by adding more columns to the index, removing the major cost of table access. Finally, materialized views can also act as pre-built join indexes and reduce the CPU and RAM required for GROUP BY aggregation.

I listed a lot of Oracle features, but Microsoft SQL Server also has many similar ones: clustered tables, covering indexes, Indexed Views, and of course partitioning. Open source databases are evolving a lot. PostgreSQL has partitioning, parallel query, clustering,… Still in Open Source, you can web-scale it to a distributed database with YugabyteDB. MySQL has been evolving a lot recently. I’ll not list all databases and all features; you can easily port the test case to other databases. A small covering-index example on the PostgreSQL test case follows.
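
For instance, here is a minimal sketch of the “Index Only access” idea on the tables above (my own addition, assuming PostgreSQL 11 or later for the INCLUDE clause):

# covering index: the aggregation can be answered from the index alone
psql -c "create index orders_by_user_covering on orders (user_id) include (order_date, amount)"
psql -c "vacuum analyze orders"
psql -c "explain (analyze, buffers)
         select date_trunc('month', order_date), sum(amount)
         from orders where user_id = 42 group by 1"

With the visibility map up to date, the plan can switch to an Index Only Scan and skip the 13 to 15 heap blocks seen earlier.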

Oh, and by the way, if you test it on Oracle Database you will quickly realize that the query from the article may not even do any Join operation, because when you query with the USER_ID you probably don’t need to go back to the USERS table. And thanks to the declared foreign key, the optimizer knows that the join is not necessary: https://dbfiddle.uk/?rdbms=oracle_18&fiddle=a5afeba2fdb27dec7533545ab0a6eb0e.

NoSQL databases are good for many use cases. But remember that they are key-value optimized data stores. When you start to split a document into multiple entities with their own keys, remember that relational databases were invented for exactly that. I blogged about another myth recently: The myth of NoSQL (vs. RDBMS) agility: adding attributes. All feedback welcome, here or on Twitter:

A new blog post on another myth in NoSQL vs. RDBMS:
“joins dont scale” https://t.co/u4aBE4HXb7

— Franck Pachot (@FranckPachot) July 5, 2020


And Twitter has also Alex DeBrie interesting thoughts about it:

Franck Pachot (@FranckPachot ) is an RDBMS expert and wrote a really nice response to an earlier post I wrote about how DynamoDB scales where relational databases don't. I highly recommend reading it.

Link below, with 3 thoughts from me to follow. https://t.co/DjOK3RWJZc

— Alex DeBrie (@alexbdebrie) July 8, 2020

The article The myth of NoSQL (vs. RDBMS) “joins dont scale” appeared first on the dbi services blog.

Oracle ACFS: “du” vs. “df” and “acfsutil info”

Fri, 2020-07-03 07:19
By Franck Pachot

.
This is a demo about Oracle ACFS snapshots, and how to understand the used and free space, as displayed by “df”, when there are modifications in the base parent or the snapshot children. The important concept to understand is that, once you have taken a snapshot, any modification to the child or to the parent will allocate new extents, and this space is accounted to the snapshot.


[grid@cloud ~]$ asmcmd lsdg DATAC1

State    Type  Rebal  Sector  Logical_Sector  Block       AU   Total_MB    Free_MB  Req_mir_free_MB  Usable_file_MB  Offline_disks  Voting_files  Name
MOUNTED  HIGH  N         512             512   4096  4194304  214991104  191881152         11943936        59979016              0             Y  DATAC1/

On a database machine in the Oracle Cloud I have a diskgroup with a lot of free space. I’ll use this DATAC1 diskgroup to store my ACFS filesystem. The size in megabytes is not easy to read, but
I can get a friendlier overview from acfsutil, with human-readable sizes (in terabytes here).


[grid@cloud ~]$ acfsutil info storage -u TB -l DATAC1

Diskgroup: DATAC1 (83% free)
  total disk space:         205.03
  ASM file space:            22.04
  total free space:         182.99
  net free with mirroring:   60.99
  usable after reservation:  57.20
  redundancy type:          HIGH

    Total space used by ASM non-volume files:
      used:                     19.03
      mirror used:               6.34

    volume: /dev/asm/dump-19
      total:                     1.00
      free:                      0.22
      redundancy type:         high
      file system:             /u03/dump
----
unit of measurement: TB

There’s already a volume here (/dev/asm/dump-19), named ‘DUMP’ and mounted as an ACFS filesystem (/u03/dump), but I’ll create a new one for this demo and will remove it afterwards.

Logical volume: asmcmd volcreate

I have a diskgroup which exposes disk space to ASM. The first thing to do, in order to build a filesystem on it, is to create a logical volume, which is called ADVM for “ASM Dynamic Volume Manager”.


[grid@cloud ~]$ asmcmd volcreate -G DATAC1 -s 100G MYADVM

I’m creating a 100GB new volume, identified by its name (MYADVM) within the diskgroup (DATAC1). The output is not verbose at all but I can check it with volinfo.


[grid@cloud ~]$ asmcmd volinfo -G DATAC1 -a

Diskgroup Name: DATAC1

         Volume Name: DUMP
         Volume Device: /dev/asm/dump-19
         State: ENABLED
         Size (MB): 1048576
         Resize Unit (MB): 512
         Redundancy: HIGH
         Stripe Columns: 8
         Stripe Width (K): 1024
         Usage: ACFS
         Mountpath: /u03/dump

         Volume Name: MYADVM
         Volume Device: /dev/asm/myadvm-19
         State: ENABLED
         Size (MB): 102400
         Resize Unit (MB): 512
         Redundancy: HIGH
         Stripe Columns: 8
         Stripe Width (K): 1024
         Usage:
         Mountpath:

With “-a” I show all volumes on the diskgroup, but I could have mentioned the volume name. This is what we will do later.

“Usage” is empty because there’s no filesystem created yet, and “Mountpath” is empty because it is not mounted. The information I need is the Volume Device (/dev/asm/myadvm-19) as I’ll have access to it from the OS.


[grid@cloud ~]$ lsblk -a /dev/asm/myadvm-19

NAME          MAJ:MIN  RM  SIZE RO TYPE MOUNTPOINT
asm!myadvm-19 248:9731  0  100G  0 disk

[grid@cloud ~]$ lsblk -t /dev/asm/myadvm-19

NAME          ALIGNMENT MIN-IO OPT-IO PHY-SEC LOG-SEC ROTA SCHED    RQ-SIZE   RA
asm!myadvm-19         0    512      0     512     512    1 deadline     128  128

[grid@cloud ~]$ fdisk -l /dev/asm/myadvm-19

Disk /dev/asm/myadvm-19: 107.4 GB, 107374182400 bytes
255 heads, 63 sectors/track, 13054 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

Visible from the OS, I have a 100GB volume with nothing there: I need to format it.

Filesystem: Linux mkfs

[grid@cloud ~]$ grep -v nodev /proc/filesystems

        iso9660
        ext4
        fuseblk
        acfs

ACFS is a filesystem supported by the kernel. I can use “mkfs” to format the volume to ACFS.


[grid@cloud ~]$ mkfs.acfs -f /dev/asm/myadvm-19

mkfs.acfs: version                   = 18.0.0.0.0
mkfs.acfs: on-disk version           = 46.0
mkfs.acfs: volume                    = /dev/asm/myadvm-19
mkfs.acfs: volume size               = 107374182400  ( 100.00 GB )
mkfs.acfs: Format complete.

I used “-f” because my volume was already formatted from a previous test. I have now a 100GB filesystem on this volume.


[grid@cloud ~]$ asmcmd volinfo -G DATAC1 MYADVM

Diskgroup Name: DATAC1

         Volume Name: MYADVM
         Volume Device: /dev/asm/myadvm-19
         State: ENABLED
         Size (MB): 102400
         Resize Unit (MB): 512
         Redundancy: HIGH
         Stripe Columns: 8
         Stripe Width (K): 1024
         Usage: ACFS
         Mountpath:

The “Usage” shows that ASM knows that the ADVM volume holds an ACFS filesystem.


[opc@cloud ~]$ sudo mkdir -p /myacfs

I’ll mount this filesystem to /myacfs


[opc@cloud ~]$ sudo mount -t acfs /dev/asm/myadvm-19 /myacfs

[opc@cloud ~]$ mount | grep /myacfs

/dev/asm/myadvm-19 on /myacfs type acfs (rw)

[opc@cloud ~]$ df -Th /myacfs

Filesystem         Type  Size  Used Avail Use% Mounted on
/dev/asm/myadvm-19 acfs  100G  448M  100G   1% /myacfs

[opc@cloud ~]$ ls -alrt /myacfs

total 100
drwxr-xr-x 30 root root  4096 Jun 23 10:40 ..
drwxr-xr-x  4 root root 32768 Jun 23 15:48 .
drwx------  2 root root 65536 Jun 23 15:48 lost+found

[opc@cloud ~]$ sudo umount /myacfs

I wanted to show that you can mount it from the OS but, as we have Grid Infrastructure, we usually want to manage it as a cluster resource.


[opc@cloud ~]$ sudo acfsutil registry -a /dev/asm/myadvm-19 /myacfs -u grid

acfsutil registry: mount point /myacfs successfully added to Oracle Registry

[opc@cloud ~]$ ls -alrt /myacfs

total 100
drwxr-xr-x 30 root root      4096 Jun 23 10:40 ..
drwxr-xr-x  4 grid oinstall 32768 Jun 23 15:48 .
drwx------  2 root root     65536 Jun 23 15:48 lost+found

The difference here is that I’ve set the owner of the filesystem to “grid” as that’s the user I’ll use for the demo.

Used and free space

[grid@cloud ~]$ asmcmd volinfo -G DATAC1 MYADVM

Diskgroup Name: DATAC1

         Volume Name: MYADVM
         Volume Device: /dev/asm/myadvm-19
         State: ENABLED
         Size (MB): 102400
         Resize Unit (MB): 512
         Redundancy: HIGH
         Stripe Columns: 8
         Stripe Width (K): 1024
         Usage: ACFS
         Mountpath: /myacfs

ADVM has the information about the mount path. From now on, to interact with the filesystem, I’ll use “acfsutil” for the ACFS-specific features, or the standard Linux filesystem commands.


[grid@cloud ~]$ du -ah /myacfs

64K     /myacfs/lost+found
96K     /myacfs

I have no files there: only 96 KB used.


[grid@cloud ~]$ df -Th /myacfs

Filesystem         Type  Size  Used Avail Use% Mounted on
/dev/asm/myadvm-19 acfs  100G  688M  100G   1% /myacfs

The whole size of 100 GB is shown as available.


[grid@cloud ~]$ acfsutil info storage -u GB -l DATAC1

Diskgroup: DATAC1 (83% free)
  total disk space:         209952.25
  ASM file space:           22864.75
  total free space:         187083.87
  net free with mirroring:  62361.29
  usable after reservation: 58473.23
  redundancy type:          HIGH

    Total space used by ASM non-volume files:
      used:                    19488.04
      mirror used:             6496.01

    volume: /dev/asm/myadvm-19
      total:                   100.00
      free:                     99.33
      redundancy type:         high
      file system:             /myacfs
...
----
unit of measurement: GB

“acfsutil info storage” shows all volumes in the diskgroup (I removed the output for the ‘DUMP’ volume here).


[grid@cloud ~]$ acfsutil info fs /myacfs
/myacfs
    ACFS Version: 18.0.0.0.0
    on-disk version:       47.0
    compatible.advm:       18.0.0.0.0
    ACFS compatibility:    18.0.0.0.0
    flags:        MountPoint,Available
    mount time:   Tue Jun 23 15:52:35 2020
    mount sequence number: 8
    allocation unit:       4096
    metadata block size:   4096
    volumes:      1
    total size:   107374182400  ( 100.00 GB )
    total free:   106652823552  (  99.33 GB )
    file entry table allocation: 393216
    primary volume: /dev/asm/myadvm-19
        label:
        state:                 Available
        major, minor:          248, 9731
        logical sector size:   512
        size:                  107374182400  ( 100.00 GB )
        free:                  106652823552  (  99.33 GB )
        metadata read I/O count:         1203
        metadata write I/O count:        10
        total metadata bytes read:       4927488  (   4.70 MB )
        total metadata bytes written:    40960  (  40.00 KB )
        ADVM diskgroup:        DATAC1
        ADVM resize increment: 536870912
        ADVM redundancy:       high
        ADVM stripe columns:   8
        ADVM stripe width:     1048576
    number of snapshots:  0
    snapshot space usage: 0  ( 0.00 )
    replication status: DISABLED
    compression status: DISABLED

“acfsutil info fs” is the best way to have all information and here it shows the same as what we see from the OS: size=100GB and free=99.33GB

Add a file

[grid@cloud ~]$ dd of=/myacfs/file.tmp if=/dev/zero bs=1G count=42

42+0 records in
42+0 records out
45097156608 bytes (45 GB) copied, 157.281 s, 287 MB/s

I have created a 42 GB file in my filesystem. Now let’s compare the size information.


[grid@cloud ~]$ du -ah /myacfs

64K     /myacfs/lost+found
43G     /myacfs/file.tmp
43G     /myacfs

I see one additional file with 43GB


[grid@cloud ~]$ df -Th /myacfs

Filesystem         Type  Size  Used Avail Use% Mounted on
/dev/asm/myadvm-19 acfs  100G   43G   58G  43% /myacfs

The used space that was 688 MB is now 43 GB, the free space that was 100 GB is now 58 GB, and the 1% usage is now 43% – this is the exact math (rounded to the next GB) as my file had to allocate extents for all its data.


[grid@cloud ~]$ acfsutil info storage -u GB -l DATAC1

Diskgroup: DATAC1 (83% free)
  total disk space:         209952.25
  ASM file space:           22864.75
  total free space:         187083.87
  net free with mirroring:  62361.29
  usable after reservation: 58473.23
  redundancy type:          HIGH

    Total space used by ASM non-volume files:
      used:                    19488.04
      mirror used:             6496.01

    volume: /dev/asm/myadvm-19
      total:                   100.00
      free:                     57.29
      redundancy type:         high
      file system:             /myacfs

Here the “free:” that was 99.33 GB before (“total free: 106652823552 ( 99.33 GB )”) is now 57.29 GB, which matches what we see from df.


[grid@cloud ~]$ acfsutil info fs /myacfs

/myacfs
    ACFS Version: 18.0.0.0.0
    on-disk version:       47.0
    compatible.advm:       18.0.0.0.0
    ACFS compatibility:    18.0.0.0.0
    flags:        MountPoint,Available
    mount time:   Tue Jun 23 15:52:35 2020
    mount sequence number: 8
    allocation unit:       4096
    metadata block size:   4096
    volumes:      1
    total size:   107374182400  ( 100.00 GB )
    total free:   61513723904  (  57.29 GB )
    file entry table allocation: 393216
    primary volume: /dev/asm/myadvm-19
        label:
        state:                 Available
        major, minor:          248, 9731
        logical sector size:   512
        size:                  107374182400  ( 100.00 GB )
        free:                  61513723904  (  57.29 GB )
        metadata read I/O count:         1686
        metadata write I/O count:        130
        total metadata bytes read:       8687616  (   8.29 MB )
        total metadata bytes written:    3555328  (   3.39 MB )
        ADVM diskgroup:        DATAC1
        ADVM resize increment: 536870912
        ADVM redundancy:       high
        ADVM stripe columns:   8
        ADVM stripe width:     1048576
    number of snapshots:  0
    snapshot space usage: 0  ( 0.00 )
    replication status: DISABLED
    compression status: DISABLED

What has changed here is only the “total free:” from 99.33 GB to 57.29 GB

Create a snapshot

For the moment, with no snapshot and no compression, the size allocated in the filesystem (as the “df” used) is the same as the size of the files (as the “du” file size).


[grid@cloud ~]$ acfsutil snap info /myacfs

    number of snapshots:  0
    snapshot space usage: 0  ( 0.00 )

“acfsutil info fs” shows no snapshots, and “acfsutil snap info” shows the same, but it will give more details once I have created some snapshots.


[grid@cloud ~]$ acfsutil snap create -w S1 /myacfs

acfsutil snap create: Snapshot operation is complete.

This has created a read-write snapshot of the filesystem.


[grid@cloud ~]$ du -ah /myacfs

64K     /myacfs/lost+found
43G     /myacfs/file.tmp
43G     /myacfs

Nothing is different here, as this shows only the base. The snapshots are in a hidden .ACFS directory that is not listed, but it can be accessed by specifying the path explicitly.


[grid@cloud ~]$ du -ah /myacfs/.ACFS

32K     /myacfs/.ACFS/.fileid
32K     /myacfs/.ACFS/repl
43G     /myacfs/.ACFS/snaps/S1/file.tmp
43G     /myacfs/.ACFS/snaps/S1
43G     /myacfs/.ACFS/snaps
43G     /myacfs/.ACFS

Here I see my 42 GB file in the snapshot. But, as I did no modifications there, it shares the same extents on disk. This means that even if I see two files of 42 GB there is only 42 GB allocated for those two virtual copies.


[grid@cloud ~]$ df -Th /myacfs

Filesystem         Type  Size  Used Avail Use% Mounted on

/dev/asm/myadvm-19 acfs  100G   43G   58G  43% /myacfs

As there are no additional extents allocated for this snapshot, “df” shows the same as before. Here we start to see a difference between the physical “df” size and the virtual “du” size.


[grid@cloud ~]$ acfsutil info storage -u GB -l DATAC1

Diskgroup: DATAC1 (83% free)
  total disk space:         209952.25
  ASM file space:           22864.75
  total free space:         187083.87
  net free with mirroring:  62361.29
  usable after reservation: 58473.23
  redundancy type:          HIGH

    Total space used by ASM non-volume files:
      used:                    19488.04
      mirror used:             6496.01

    volume: /dev/asm/myadvm-19
      total:                   100.00
      free:                     57.29
      redundancy type:         high
      file system:             /myacfs
        snapshot: S1 (/myacfs/.ACFS/snaps/S1)
          used:          0.00
          quota limit:   none
...
----
unit of measurement: GB

Here I have additional information for my snapshot: “snapshot: S1 (/myacfs/.ACFS/snaps/S1)” with 0 GB used because I did not modify anything in the snapshot yet. The volume still has the same free space.


[grid@cloud ~]$ acfsutil snap info /myacfs
    number of snapshots:  1
    snapshot space usage: 393216  ( 384.00 KB )

The new thing here is “number of snapshots: 1” with nearly no space usage (384 KB)


[grid@cloud ~]$ acfsutil info fs /myacfs
/myacfs
    ACFS Version: 18.0.0.0.0
    on-disk version:       47.0
    compatible.advm:       18.0.0.0.0
    ACFS compatibility:    18.0.0.0.0
    flags:        MountPoint,Available
    mount time:   Tue Jun 23 15:52:35 2020
    mount sequence number: 8
    allocation unit:       4096
    metadata block size:   4096
    volumes:      1
    total size:   107374182400  ( 100.00 GB )
    total free:   61513723904  (  57.29 GB )
    file entry table allocation: 393216
    primary volume: /dev/asm/myadvm-19
        label:
        state:                 Available
        major, minor:          248, 9731
        logical sector size:   512
        size:                  107374182400  ( 100.00 GB )
        free:                  61513723904  (  57.29 GB )
        metadata read I/O count:         4492
        metadata write I/O count:        236
        total metadata bytes read:       24363008  (  23.23 MB )
        total metadata bytes written:    12165120  (  11.60 MB )
        ADVM diskgroup:        DATAC1
        ADVM resize increment: 536870912
        ADVM redundancy:       high
        ADVM stripe columns:   8
        ADVM stripe width:     1048576
    number of snapshots:  1
    snapshot space usage: 393216  ( 384.00 KB )
    replication status: DISABLED
    compression status: DISABLED

Here I have more detail: those 384 KB are for the file allocation table only.

Write on snapshot

My snapshot shows no additional space usage, but that will not last, as the goal is to do some modifications within the same file, like a database would do on its datafiles as soon as it is opened read-write.


[grid@cloud ~]$ dd of=/myacfs/.ACFS/snaps/S1/file.tmp if=/dev/zero bs=1G count=10 conv=notrunc

10+0 records in
10+0 records out
10737418240 bytes (11 GB) copied, 58.2101 s, 184 MB/s

I’ve overwritten 10 GB within the 42 GB file, but the remaining 32 GB are still the same (that’s what conv=notrunc does – it does not truncate the end of the file).


[grid@cloud ~]$ du -ah /myacfs

64K     /myacfs/lost+found
43G     /myacfs/file.tmp
43G     /myacfs

Nothing has changed at the base level because I modified only the file in the snapshot.



[grid@cloud ~]$ du -ah /myacfs/.ACFS

32K     /myacfs/.ACFS/.fileid
32K     /myacfs/.ACFS/repl
43G     /myacfs/.ACFS/snaps/S1/file.tmp
43G     /myacfs/.ACFS/snaps/S1
43G     /myacfs/.ACFS/snaps
43G     /myacfs/.ACFS

Nothing has changed either in the snapshot because the file is still the same size.


[grid@cloud ~]$ df -Th /myacfs

Filesystem         Type  Size  Used Avail Use% Mounted on
/dev/asm/myadvm-19 acfs  100G   53G   48G  53% /myacfs

The space used in the filesystem has increased by the amount I modified in the snapshot: I had 58 GB available and now only 48 GB, because the 10 GB of extents that were shared between the snapshot child and the base (the snapshot parent) are not the same anymore, so new extents had to be allocated for the snapshot. This is similar to copy-on-write.

That’s the main point of this blog post: you don’t see this new allocation with “ls” or “du” but only with “df” or the filesystem-specific tools.
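
A quick side-by-side check (my own helper commands, not part of the original demo) to spot that divergence at any point:

# virtual sizes seen by du vs. space really allocated in the filesystem
du -sh /myacfs /myacfs/.ACFS/snaps/S1
df -h /myacfs
acfsutil snap info /myacfs | grep -i 'space usage'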


[grid@cloud ~]$ acfsutil info storage -u GB -l DATAC1

Diskgroup: DATAC1 (83% free)
  total disk space:         209952.25
  ASM file space:           22864.75
  total free space:         187083.87
  net free with mirroring:  62361.29
  usable after reservation: 58473.23
  redundancy type:          HIGH

    Total space used by ASM non-volume files:
      used:                    19488.04
      mirror used:             6496.01

    volume: /dev/asm/myadvm-19
      total:                   100.00
      free:                     47.09
      redundancy type:         high
      file system:             /myacfs
        snapshot: S1 (/myacfs/.ACFS/snaps/S1)
          used:         10.10
          quota limit:   none

----
unit of measurement: GB

The volume free space has decreased from 57.29 GB to 47.09 GB.


[grid@cloud ~]$ acfsutil snap info /myacfs
snapshot name:               S1
snapshot location:           /myacfs/.ACFS/snaps/S1
RO snapshot or RW snapshot:  RW
parent name:                 /myacfs
snapshot creation time:      Fri Jul  3 08:01:52 2020
file entry table allocation: 393216   ( 384.00 KB )
storage added to snapshot:   10846875648   (  10.10 GB )

    number of snapshots:  1
    snapshot space usage: 10846875648  (  10.10 GB )

10 GB has been added to the snapshot for the extents that are different from the parent.


[grid@cloud ~]$ acfsutil info fs /myacfs

/myacfs
    ACFS Version: 18.0.0.0.0
    on-disk version:       47.0
    compatible.advm:       18.0.0.0.0
    ACFS compatibility:    18.0.0.0.0
    flags:        MountPoint,Available
    mount time:   Fri Jul  3 07:33:19 2020
    mount sequence number: 6
    allocation unit:       4096
    metadata block size:   4096
    volumes:      1
    total size:   107374182400  ( 100.00 GB )
    total free:   50566590464  (  47.09 GB )
    file entry table allocation: 393216
    primary volume: /dev/asm/myadvm-19
        label:
        state:                 Available
        major, minor:          248, 9731
        logical sector size:   512
        size:                  107374182400  ( 100.00 GB )
        free:                  50566590464  (  47.09 GB )
        metadata read I/O count:         9447
        metadata write I/O count:        1030
        total metadata bytes read:       82587648  (  78.76 MB )
        total metadata bytes written:    89440256  (  85.30 MB )
        ADVM diskgroup:        DATAC1
        ADVM resize increment: 536870912
        ADVM redundancy:       high
        ADVM stripe columns:   8
        ADVM stripe width:     1048576
    number of snapshots:  1
    snapshot space usage: 10846875648  (  10.10 GB )
    replication status: DISABLED

The file entry table allocation has not changed. It was pointing to the parent for all extents before. Now, 10GB of extents are pointing to the snapshot child ones.

Write on parent

If I overwrite a different part of the file on the parent, new extents will need to be allocated, as those extents are shared with the snapshot.


[grid@cloud ~]$ dd of=/myacfs/file.tmp if=/dev/zero bs=1G count=10 conv=notrunc seek=20

10+0 records in
10+0 records out
10737418240 bytes (11 GB) copied, 35.938 s, 299 MB/s

This has written 10 GB again, but at a 20 GB offset, not overlapping the range of extents modified in the snapshot.


[grid@cloud ~]$ df -Th /myacfs
Filesystem         Type  Size  Used Avail Use% Mounted on
/dev/asm/myadvm-19 acfs  100G   63G   38G  63% /myacfs

Obviously 10GB had to be allocated for those new extents.


[grid@cloud ~]$ acfsutil snap info /myacfs
snapshot name:               S1
snapshot location:           /myacfs/.ACFS/snaps/S1
RO snapshot or RW snapshot:  RW
parent name:                 /myacfs
snapshot creation time:      Fri Jul  3 09:17:10 2020
file entry table allocation: 393216   ( 384.00 KB )
storage added to snapshot:   10846875648   (  10.10 GB )


    number of snapshots:  1
    snapshot space usage: 21718523904  (  20.23 GB )

Those 10GB modified on the base are allocated for the snapshot, because what was a pointer to the parent is now a copy of extents in the state they were before the write.

I’m not going into the details of copy-on-write versus redirect-on-write. You can see it at file level with “acfsutil info file”, and Ludovico Caldara made a great demo of it a few years ago.

Now what happens if I modify, in the parent, the same extents that have been modified in the snapshot? I don’t need a copy of the previous version of them, right?


[grid@cloud ~]$ dd of=/myacfs/file.tmp if=/dev/zero bs=1G count=10 conv=notrunc

10+0 records in
10+0 records out
10737418240 bytes (11 GB) copied, 34.9208 s, 307 MB/s

The first 10GB of this file, since the snapshot was taken, have now been modified in the base and in all snapshots (only one here).


[grid@cloud ~]$ df -Th /myacfs
Filesystem         Type  Size  Used Avail Use% Mounted on
/dev/asm/myadvm-19 acfs  100G   74G   27G  74% /myacfs

I had 10GB allocated again for this.


[grid@cloud ~]$ acfsutil snap info /myacfs
snapshot name:               S1
snapshot location:           /myacfs/.ACFS/snaps/S1
RO snapshot or RW snapshot:  RW
parent name:                 /myacfs
snapshot creation time:      Fri Jul  3 09:17:10 2020
file entry table allocation: 393216   ( 384.00 KB )
storage added to snapshot:   10846875648   (  10.10 GB )

    number of snapshots:  1
    snapshot space usage: 32564994048  (  30.33 GB )

[grid@cloud ~]$ acfsutil info file /myacfs/.ACFS/snaps/S1/file.tmp -u |
                awk '/(Yes|No)$/{ if ( $7!=l ) o=$1 ; s[o]=s[o]+$2 ; i[o]=$7 ; l=$7} END{ for (o in s) printf "offset: %4dGB size: %4dGB Inherited: %3s\n",o/1024/1024/1024,s[o]/1024/1024/1024,i[o]}' | sort
offset:    0GB size:   10GB Inherited:  No
offset:   10GB size:   31GB Inherited: Yes

I have aggregated, with awk, the ranges of snapshot extents according to whether they are inherited from the parent. The first 10GB are those that I modified in the snapshot. The others, above 10GB, are still inherited from the parent.


[grid@cloud ~]$ acfsutil info file /myacfs/file.tmp -u | 
                awk '/(Yes|No)$/{ if ( $7!=l ) o=$1 ; s[o]=s[o]+$2 ; i[o]=$7 ; l=$7} END{ for (o in s) printf "offset: %4dGB size: %4dGB Inherited: %3s %s\n",o/1024/1024/1024,s[o]/1024/1024/1024,i[o],x[o]}' | sort
offset:    0GB size:   10GB Inherited:  No
offset:   10GB size:    9GB Inherited: Yes
offset:   19GB size:   10GB Inherited:  No
offset:   30GB size:   11GB Inherited: Yes

Here, in the parent file, I see the two ranges that I modified flagged as not inherited, i.e. no longer shared between parent and child: 10GB from offset 0GB and 10GB from offset 20GB.

That’s the important point: when you take a snapshot, the modifications on the parent, as well as the modifications in the snapshot, will allocate new extents to add to the snapshot.

Why is this important?

This explains why, on ODA, all non-CDB databases are created on a snapshot that has been taken when the filesystem was empty.

Read-write snapshots are really cool, especially with multitenant as they are automated with CREATE PLUGGABLE DATABASE … SNAPSHOT COPY. But you must keep in mind that the snapshot is made at filesystem level (this may change in 20c with File-Based Snapshots). Any change to the master will allocate space, whether used or not in the snapshot copy.

I blogged about that in the past: https://blog.dbi-services.com/multitenant-thin-provisioning-pdb-snapshots-on-acfs/

Here I just wanted to clarify what you see with “ls” and “du” vs. “df” or “acfsutil info”. “ls” and “du” show the virtual size of the files. “df” shows the extents allocated in the filesystem as base extents or snapshot copies. “acfsutil info” shows those extents allocated as “storage added to snapshot” whether they were allocated for modifications on the parent or child.


[grid@cloud ~]$ acfsutil info file /myacfs/.ACFS/snaps/S1/file.tmp -u
/myacfs/.ACFS/snaps/S1/file.tmp

    flags:        File
    inode:        18014398509482026
    owner:        grid
    group:        oinstall
    size:         45097156608  (  42.00 GB )
    allocated:    45105545216  (  42.01 GB )
    hardlinks:    1
    device index: 1
    major, minor: 248,9731
    create time:  Fri Jul  3 07:36:45 2020
    access time:  Fri Jul  3 07:36:45 2020
    modify time:  Fri Jul  3 09:18:25 2020
    change time:  Fri Jul  3 09:18:25 2020
    extents:
        -----offset ----length | -dev --------offset | inherited
                  0  109051904 |    1    67511517184 | No
          109051904  134217728 |    1    67620569088 | No
          243269632  134217728 |    1    67754786816 | No
          377487360  134217728 |    1    67889004544 | No
          511705088  134217728 |    1    68023222272 | No
          645922816  134217728 |    1    68157440000 | No
          780140544  134217728 |    1    68291657728 | No
          914358272  134217728 |    1    68425875456 | No
         1048576000  134217728 |    1    68560093184 | No
         1182793728  134217728 |    1    68694310912 | No
...

The difference between “du” and “df”, which is also the “storage added to snapshot” displayed by “acfsutil snap info”, is what you see as “inherited”=No in “acfsutil info file -u” on all files, parent and child. Note that I didn’t use compression here, which is another reason for the difference between “du” and “df”.
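
If you want to reproduce this comparison on your own ACFS filesystem, the sequence is basically the following (a minimal sketch using the mount point and snapshot name from this post; run it as the grid user):

# virtual size of the parent file and of its snapshot copy, as "ls" and "du" report it
ls -lh /myacfs/file.tmp /myacfs/.ACFS/snaps/S1/file.tmp
du -sh /myacfs/file.tmp /myacfs/.ACFS/snaps/S1/file.tmp
# space really allocated in the filesystem, as "df" reports it
df -h /myacfs
# the same allocation reported as "storage added to snapshot"
acfsutil snap info /myacfs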

Cet article Oracle ACFS: “du” vs. “df” and “acfsutil info” est apparu en premier sur Blog dbi services.

Microsoft Teams, Power Automate and Planner

Fri, 2020-07-03 01:48

As we moved to Teams at dbi and one of our customers, automating tasks was a strong need.
One of the tasks that could be automated is the creation of tasks in Microsoft Planner which does not support recurring tasks. Every 4 weeks, we are patching the environment. For everyone awareness and follow up, we need to publish these dates in the planning. As the pace of these tasks is known, it is a bit repetitive to create these tasks manually.

Microsoft Power Automate (previously known as Flow) is a powerful tool to automate tasks and it integrates natively with Teams. My idea was to create 4 tasks just by providing the first date and then creating all of them on a 7-day interval.

Flows Overview
  1. Initialize indice to 1. We will use it to iterate
  2. Initialize a First Day of Cycle object based on user input
  3. Initialize an ArrayOfEnv which will be used for naming each task. For example, array can contain:
    1. Sandbox
    2. Dev
    3. Test
    4. Prod
  4. Then create a loop and iterate until we reach 4
    1. Creation of the task with
      • Title based on ArrayOfEnv[indice] content
      • Date based on selected First Day of Cycle + 7 x indice
    2. Increment indice

That’s it! Very basic, but it can save time.

Formulas

One of the main difficulties was to find documentation on formulas that work.
For example, the date selected will be formatted with a specific format, but the AddDays function does not handle it well if you don’t provide the expected output format. The solution is to force the formatting of the selected date and keep it the same across the whole flow.

Main functions used for this flow:

  • To get the value of a variable, I used the variables function with the variable name as parameter.
  • Add, to sum up all the parameters provided.
  • Mul, to multiply.
  • Creation of an array with createArray and the name I would like for each task:
    createArray('Sandbox','Dev','Test','Prod')
Flows Details

First part of it is the manual trigger with date as input:

Second part is initialization of variables:

Finally, the loop:

Formulas:

  • Title:
    concat(variables('ArrayOfEnv')[variables('Indice')],' ',' Patching')
  • Start and end date Time:
    addDays(variables('SelectedDate'),mul(variables('indice'),7),'yyyy-MM-dd')
Next Steps

For now, it is only a manual trigger of the flow. Next step would be to schedule that at each cycle beginning.

Power Automate looks like a promising tool, but Microsoft must still work on it to provide a full text mode, as the mix of clicks and formulas makes it impossible to export.

Cet article Microsoft Teams, Power Automate and Planner est apparu en premier sur Blog dbi services.

Oracle Data Guard RedoRoutes : What is Priority 8 ?

Wed, 2020-07-01 09:59

When dealing with cascading or far sync in a Data Guard environment, it is important to understand how to configure the RedoRoutes property.
By default, a primary database sends redo to every transport destination configured in the broker configuration. We can create a more complex transport topology, depending on our environment, using the RedoRoutes property.
Basically the RedoRoutes property has this format

(redo_routing_rule_1) [(redo_routing_rule_n)]

Where each routing rule contains a redo source field and a redo destination field separated by a colon:

(redo source : redo destination)

You can find more information in the Oracle documentation.

In this blog I am trying to simply explain how to configure the RedoRoutes property in a Data Guard environment with a far sync instance. See my previous blog for the far sync instance creation.

I am using Oracle 20c.

The first configuration we consider is the following one

We have
1 primary database: prod20_site1
2 standby databases: prod20_site2 and prod20_site4
1 far sync instance fs_site3

For far sync creation with Oracle 20c see my previous blog

Below is the status of the broker configuration:

DGMGRL> show configuration

Configuration - prod20

  Protection Mode: MaxAvailability
  Members:
  prod20_site1 - Primary database
    prod20_site2 - Physical standby database
    prod20_site4 - Physical standby database
    fs_site3     - Far sync instance

Fast-Start Failover:  Disabled

Configuration Status:
SUCCESS   (status updated 47 seconds ago)

For now, there is no RedoRoutes property configured:

DGMGRL> show database prod20_site1 redoroutes;
  RedoRoutes = ''
DGMGRL> show database prod20_site2 redoroutes;
  RedoRoutes = ''
DGMGRL> show database prod20_site4 redoroutes;
  RedoRoutes = ''
DGMGRL> 

For this configuration I want the primary database to send the redo according to the following rules:

prod20_site2 will receive redo directly from prod20_site1
prod20_site1 =====> prod20_site2

prod20_site4 will receive redo via fs_site3 which will forward redo to prod20_site4
prod20_site1 =====> fs_site3 =====> prod20_site4

and if fs_site3 is not available, prod20_site4 will receive redo directly from prod20_site1
prod20_site1 =====> prod20_site4

For this we have to first edit the primary database RedoRoutes property like

DGMGRL> edit database prod20_site1 set property redoroutes='(local:prod20_site2,(fs_site3 priority=1,prod20_site4 priority=2))';
Property "redoroutes" updated

The elements of this rule have the following meanings:

local:prod20_site2: if prod20_site1 is the primary database then redo will be sent to prod20_site2

local:(fs_site3 priority=1,prod20_site4 priority=2): if prod20_site1 is the primary database, then redo will be sent to fs_site3 or to prod20_site4. As fs_site3 has the higher priority (smaller priority numbers mean higher priority), redo is sent to fs_site3 first, and if fs_site3 is unavailable, the changes are sent to prod20_site4.
Just note that, as fs_site3 has the higher priority, redo will be sent to fs_site3 again as soon as it becomes available again.
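
If you want to check which destination is currently receiving redo after such a change, one way (a sketch only, assuming a DGMGRL connection as SYS; the connect string and password are placeholders) is to look at the LogXptStatus property of the primary:

dgmgrl -silent sys/password@prod20_site1 <<'EOF'
show database prod20_site1 LogXptStatus;
EOF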

And then we have to tell fs_site3 to forward the redo received from prod20_site1 to prod20_site4.

DGMGRL> edit far_sync fs_site3 set property redoroutes='(prod20_site1:prod20_site4 ASYNC)';
Property "redoroutes" updated

Below are the RedoRoutes we have configured for prod20_site1 and fs_site3:

DGMGRL> show database prod20_site1 redoroutes;
  RedoRoutes = '(local:prod20_site2,(fs_site3 priority=1,prod20_site4 priority=2))'
DGMGRL> show database prod20_site2 redoroutes;
  RedoRoutes = ''
DGMGRL> show database prod20_site4 redoroutes;
  RedoRoutes = ''
DGMGRL> show far_sync  fs_site3 redoroutes;
  RedoRoutes = '(prod20_site1:prod20_site4 ASYNC)'
DGMGRL>

And we can verify the status of our configuration

DGMGRL> show configuration verbose


Configuration - prod20

  Protection Mode: MaxPerformance
  Members:
  prod20_site1 - Primary database
    prod20_site2 - Physical standby database
    fs_site3     - Far sync instance
      prod20_site4 - Physical standby database
    prod20_site4 - Physical standby database (alternate of fs_site3)
…
…
Fast-Start Failover:  Disabled

Configuration Status:
SUCCESS

DGMGRL>

Let’s now consider this configuration where we have two far sync instances. As in the first configuration, we want to send the redo to the far sync instances first if possible; otherwise the redo will be sent directly to the standby databases.

The RedoRoutes property of the primary can be configured as below

DGMGRL> edit database prod20_site1 set property redoroutes='(local:(fs_site5 priority=1,prod20_site2 priority=2),(fs_site3 priority=1,prod20_site4 priority=2))';
Warning: ORA-16677: Standby database has the same or higher priority than other members specified in the RedoRoutes group.

Property "redoroutes" updated
DGMGRL>

And the RedoRoutes for the far sync fs_site5 can be adjusted like this:

DGMGRL> edit far_sync fs_site5 set property redoroutes='(prod20_site1:prod20_site2 ASYNC)';
Property "redoroutes" updated
DGMGRL>

We can then verify the status of the configuration

DGMGRL> show configuration verbose

Configuration - prod20

  Protection Mode: MaxPerformance
  Members:
  prod20_site1 - Primary database
    fs_site5     - Far sync instance
      prod20_site2 - Physical standby database
    prod20_site2 - Physical standby database (alternate of fs_site5)
    fs_site3     - Far sync instance
      prod20_site4 - Physical standby database
    prod20_site4 - Physical standby database (alternate of fs_site3)

…
…

Fast-Start Failover:  Disabled

Configuration Status:
SUCCESS

DGMGRL>

As we can see when configuring RedoRoutes, we sometimes have to deal with the property PRIORITY.
This property can have a value between 1 and 8, 1 being the highest priority and 8 the lowest.
Let’s consider two destinations A and B in the same group.

Case 1: A and B with the same priority

Redo will be sent to A or B, let’s say A. When A is unavailable, redo will then be sent to B. And when A becomes reachable again, redo will continue to be sent to B.

(local:(A priority=1,B priority=1))

Case 2: A has a higher priority than B
Redo will be sent to A. If A becomes unavailable, redo will be sent to B. And when A becomes reachable again, redo will be sent to A again as it has the higher priority.

(local:(A priority=1,B priority=2))

But sometimes, within the same group, we may want to send redo to both members. For example, in the following configuration, we just want the redo to be sent to fs_site3 if possible, and if fs_site3 is not reachable, then the changes should be sent to both prod20_site2 and prod20_site4.

In this case we can use PRIORITY 8, which has a special meaning: if the primary sends redo to a member with PRIORITY 8, then it must also send that redo to every other member with PRIORITY 8 in the group.

In the configuration above, we want the following rules:

prod20_site1 will send changes to fs_site3, which will forward them to prod20_site2 and prod20_site4, and if fs_site3 is not available, prod20_site1 will ship redo directly to both standby databases.

And when fs_site3 becomes available again, redo will be sent to fs_site3 again.

The RedoRoutes for the primary database can be set like this:

DGMGRL> edit database prod20_site1 set property redoroutes='(local:(fs_site3 priority=1,prod20_site2 priority=8,prod20_site4 priority=8))';
Warning: ORA-16677: Standby database has the same or higher priority than other members specified in the RedoRoutes group.

Property "redoroutes" updated
DGMGRL>

And for the far sync instance

DGMGRL> edit far_sync fs_site3 set property redoroutes='(prod20_site1:prod20_site2 ASYNC,prod20_site4 ASYNC)';
Property "redoroutes" updated
DGMGRL>

The status of the configuration

DGMGRL> show configuration verbose

Configuration - prod20

  Protection Mode: MaxPerformance
  Members:
  prod20_site1 - Primary database
    fs_site3     - Far sync instance
      prod20_site2 - Physical standby database
      prod20_site4 - Physical standby database
    prod20_site2 - Physical standby database (alternate of fs_site3)
    prod20_site4 - Physical standby database (alternate of fs_site3)
…
…

Fast-Start Failover:  Disabled

Configuration Status:
SUCCESS

DGMGRL>
Conclusion

Depending on the configuration, the redo transport topology can be very complex. What I can recommend when dealing with far sync instances is to think about all possible cases, including switchover and failover, and, based on all these cases, to design an architecture for the redo transport. In this blog we just considered the case where prod20_site1 is the primary.
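
As a quick sanity check before testing those switchover and failover scenarios, each member can be validated from DGMGRL. This is only a sketch: the connect string is a placeholder, and VALIDATE FAR_SYNC is available in recent releases.

dgmgrl -silent sys/password@prod20_site1 <<'EOF'
validate database prod20_site2;
validate database prod20_site4;
validate far_sync fs_site3;
EOF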

Cet article Oracle Data Guard RedoRoutes : What is Priority 8 ? est apparu en premier sur Blog dbi services.

Oracle Autonomous Linux: cron’d ksplice and yum updates

Tue, 2020-06-30 16:14
By Franck Pachot

.
Oracle Enterprise Linux (OEL) is a Linux distribution which is binary compatible with Red Hat Enterprise Linux (RHEL). However, unlike RHEL, OEL is open source, free to download, free to use, free to distribute, free to update and gets free bug fixes. And there are more frequent updates in OEL than in CentOS, the free base of RHEL. You can pay a subscription for additional support and features (like Ksplice or Dtrace) in OEL. It can run the same kernel as RHEL but also provides, still for free, the ‘unbreakable kernel’ (UEK) which is still compatible with RHEL but enhanced with optimizations, recommended especially when running Oracle products.

This is not new, and I couldn’t resist illustrating the previous paragraph with the animated gif from the years of UEK’s arrival. What is new is that OEL is also the base for the new Autonomous Linux, which can run in the Oracle Cloud, automates Ksplice to update the system online without restart, and sends notifications about these updates. You can use it in the Oracle Cloud Free Tier.

When creating an Always Free compute instance you select the Oracle Autonomous Linux image. I’ve summarized all steps there:

Autonomous Linux image

Generate an API private key

[opc@al ~]$ mkdir ~/.oci
[opc@al ~]$ openssl genrsa -out ~/.oci/oci_api_key.pem 2048 # no passphrase
[opc@al ~]$ chmod go-rwx ~/.oci/oci_api_key.pem
[opc@al ~]$ openssl rsa -pubout -in ~/.oci/oci_api_key.pem -out ~/.oci/oci_api_key_public.pem
writing RSA key

This generates the API key pair; as the al-config output below suggests, the private key is only needed temporarily on this host.
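
The public key then has to be associated with your OCI user (Identity » Users » API Keys). To display it, and to compute the fingerprint the console will show after the upload, something like this should do (a sketch, using the paths from above; the md5 command is the one documented for OCI API keys):

# public key to paste into the console (Identity > Users > API Keys)
cat ~/.oci/oci_api_key_public.pem
# fingerprint of the key, to compare with what the console displays
openssl rsa -pubout -outform DER -in ~/.oci/oci_api_key.pem | openssl md5 -c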

Configure OCI CLI profile

[opc@al ~]$ sudo al-config -u ocid1.user.oc1..aaaaaaaafo2liscovfefeubflxm2rswrzpnnmosue4lczmgaaodwtqdljj3q -t ocid1.tenancy.oc1..aaaaaaaazlv5wxkdhldyvxkkta7rjn25ocovfefexhidte5zgiyauut2i2gq -k ~/.oci/oci_api_key.pem
Configured OCI CLI profile.
Please delete /home/opc/.oci/oci_api_key.pem

This configures the OCI CLI profile for my user (ocid1.user.oc1.. is my user OCID, which I find in Oracle Cloud » Governance and Administration » Identity » Users » Users Detail » OCID copy) in my tenancy (ocid1.tenancy.oc1.. is my tenancy OCID, which I find in Oracle Cloud » Governance and Administration » Administration » Tenancy Details » OCID copy).

Notifications

When something happens autonomously you want to be notified for it. This uses the OCI notification service with a topic you subscribe to. This is also available in the Free Tier. The topic is created with Oracle Cloud » Application Integration » Notifications » Create Topic, where you just put a name and a description and get the OCID for it (ocid1.onstopic.oc1.eu-frankfurt-1… for me).

While in the console, on this topic I’ve created a subscription where I have put my e-mail address. I’ll receive by e-mail all notifications sent to this topic.

Configure OCI notification service topic OCID

[opc@al ~]$ sudo al-config -T ocid1.onstopic.oc1.eu-frankfurt-1.aaaaaaaaamo7khj3xab6oec5xtcovfefeokqszapwsafeje6g6ltlnhd363a
Configured OCI notification service topic OCID.
Publishing message 'AL: Notification enabled on instance AL'
Published message 'AL: Notification enabled on instance AL'

In the Autonomous Linux instance I’ve set up the OCI notification service topic OCID. And that’s all.

Check your e-mails: you have to acknowledge the subscription to receive the notifications, of course.

Kernel version

[opc@al ~]$ uname -a
Linux al 4.14.35-1902.301.1.el7uek.x86_64 #2 SMP Tue Mar 31 16:50:32 PDT 2020 x86_64 x86_64 x86_64 GNU/Linux

Here is the kernel version that has been installed


[opc@al ~]$ sudo uptrack-uname -a
Linux al 4.14.35-1902.302.2.el7uek.x86_64 #2 SMP Fri Apr 24 14:24:11 PDT 2020 x86_64 x86_64 x86_64 GNU/Linux

This is the effective kernel updated with Ksplice


[opc@al ~]$ sudo uptrack-show
Installed updates:
[cp1p7rl5] Known exploit detection.
[3kfqruxl] Known exploit detection for CVE-2017-7308.
[6vy9wlov] Known exploit detection for CVE-2018-14634.
[r8wncd28] KPTI enablement for Ksplice.
[3e9je971] Known exploit detection for CVE-2018-18445.
[20bmudk6] Out-of-bounds access when classifying network packets with traffic control index.
[oy5cke5u] NULL dereference while writing Hyper-V SINT14 MSR.
[5jsm8lzj] CVE-2020-9383: Information leak in floppy disk driver.
[5p7yd05e] NULL pointer dereference when initializing Differentiated Services marker driver.
[sajmv0xh] CVE-2018-19854: Information leak in cryptography socket NETLINK_CRYPTO call.
[1gefn4lp] CVE-2019-19965: Denial-of-service in SCSI device removal.
[6hu77eez] Invalid memory access when sending an excessively large packet using Segmentation Offloads.
[f0zxddhg] Livelock in loop device block resize operation.
[2lgm3hz9] CVE-2019-14814, CVE-2019-14815, CVE-2019-14816: Denial-of-service when parsing access point settings in Marvell WiFi-Ex driver.
[3yqxyw42] CVE-2019-20096: Memory leak while changing DCCP socket SP feature values.
[9g5kf79r] Improved fix for CVE-2020-2732: Privilege escalation in Intel KVM nested emulation.
[bq9hiiuj] Race condition in ipoib during high request load causes denial-of-service.
[3youemoz] CVE-2020-11494: Information leak in serial line CAN device communication.
[jpbi3wnm] Use-after-free when removing generic block device.
[if1ety6t] Memory corruption when reading EFI sysfs entries.
[iv8r17d8] CVE-2020-8648: Use-after-free in virtual terminal selection buffer.
[mojwd0zk] Various Spectre-V1 information leaks in KVM.
[nvi6r5wx] CVE-2019-19527: Denial-of-service in USB HID device open.
[o3df6mds] CVE-2020-8647, CVE-2020-8649: Use-after-free in the VGA text console driver.
[kjyqg48a] CVE-2019-19532: Denial-of-service when initializing HID devices.
[74j9dhee] Divide-by-zero when CPU capacity changes causes denial-of-service.
[lgsoxuy7] CVE-2019-19768: Use-after-free when reporting an IO trace.

Effective kernel version is 4.14.35-1902.302.2.el7uek

All the details about the fixes applied by Ksplice, without any reboot, are listed there.

One month later

I created this instance on May 23rd, 2020 and I am writing this one month later.

Here are the e-mails I’ve received from the topic subscription:

And my current machine state:


[opc@al ~]$ uptime
 19:26:39 up 38 days, 13:49,  2 users,  load average: 0.07, 0.02, 0.00
[opc@al ~]$ uname -a
Linux al 4.14.35-1902.301.1.el7uek.x86_64 #2 SMP Tue Mar 31 16:50:32 PDT 2020 x86_64 x86_64 x86_64 GNU/Linux
[opc@al ~]$ sudo uptrack-uname -a
Linux al 4.14.35-1902.303.4.1.el7uek.x86_64 #2 SMP Fri May 29 14:56:41 PDT 2020 x86_64 x86_64 x86_64 GNU/Linux
[opc@al ~]$

The VM has been running 24/7 without outage and the effective kernel is now higher than when installed.

Ksplice updates

This effective kernel has been updated on Tue Jun 16 08:04:33 GMT 2020 as reported by this e-mail I received:


noreply@notification.eu-frankfurt-1.oraclecloud.com
Jun 16, 2020, 10:04 AM
to AutonomousLinux

+------------------------------------------------------------------------+
|  Summary (Tue Jun 16 08:04:33 GMT 2020)                                |
+------------------------------------------------------------------------+
Ksplice updates installed: yes
Yum updates installed: no
Uptime: 08:04:33 up 24 days,  2:27,  0 users,  load average: 0.72, 0.20, 0.06
+------------------------------------------------------------------------+
|  Ksplice upgrade report                                                |
+------------------------------------------------------------------------+
Running 'ksplice -y all upgrade'.
Updating on-disk packages for new processes
Loaded plugins: langpacks
No packages marked for update
Nothing to do.
The following steps will be taken:
Install [i622mubr] Information leak in KVM_HC_CLOCK_PAIRING hypercall.
Install [35xnb9pi] CVE-2019-9500: Potential heap overflow in Broadcom FullMAC WLAN driver.
Install [ppqwl5uh] CVE-2019-15505: Out-of-bounds access in Technisat DVB-S/S2 USB2.0 driver.
Install [ctobm6wo] CVE-2019-19767: Use-after-free in with malformed ext4 filesystems.
Install [l5so0kqe] CVE-2019-19056, CVE-2019-19057: Denial-of-service in the Marvell mwifiex PCIe driver.
Install [b4iszmv7] CVE-2019-20636: Out-of-bounds write via crafted keycode table.
Install [5oec4s3n] Denial-of-service when mounting an ocfs2 filesystem.
Install [rafq9pe9] CVE-2019-9503: Denial-of-service when receiving firmware event frames over a Broadcom WLAN USB dongle.
Install [nlpu7kxi] Denial-of-service when initializing a serial CAN device.
Install [lnz9di5t] CVE-2020-11608: NULL pointer dereference when initializing USB GSPCA based webcams.
Install [2bodr9yk] CVE-2019-19537: Denial-of-service in USB character device registration.
Install [9iw2y1wn] CVE-2019-19524: Use-after-free when unregistering memoryless force-feedback driver.
Install [h5s7eh41] CVE-2020-11609: NULL pointer dereference when initializing STV06XX USB Camera device.
Install [behlqry8] Denial-of-service via invalid TSC values in KVM.
Install [onllaobw] CVE-2019-12819: Use-after-free during initialization of MDIO bus driver.
Install [fdn63bdc] CVE-2019-11599: Information leak in the coredump implementation.
Install [kb3b03z9] CVE-2019-19058: Denial-of-service in iwlwifi firmware interface.
Install [mgfi6p6r] Use-after-free when writing to SLIP serial line.
Install [hs2h9j8w] CVE-2019-14896, CVE-2019-14897: Denial-of-service when parsing BSS in Marvell 8xxx Libertas WLAN driver.
Install [bb9sd52m] CVE-2020-11668: NULL pointer dereference when initializing Xirlink C-It USB camera device.
Install [p4ygwgyj] Information leak in KVM's VMX operation path.
Install [1uxt1xo6] NFSv4 client fails to correctly renew lease when using fsinfo.
Install [hjoeh3zi] CVE-2020-0543: Side-channel information leak using SRBDS.
Installing [i622mubr] Information leak in KVM_HC_CLOCK_PAIRING hypercall.
Installing [35xnb9pi] CVE-2019-9500: Potential heap overflow in Broadcom FullMAC WLAN driver.
Installing [ppqwl5uh] CVE-2019-15505: Out-of-bounds access in Technisat DVB-S/S2 USB2.0 driver.
Installing [ctobm6wo] CVE-2019-19767: Use-after-free in with malformed ext4 filesystems.
Installing [l5so0kqe] CVE-2019-19056, CVE-2019-19057: Denial-of-service in the Marvell mwifiex PCIe driver.
Installing [b4iszmv7] CVE-2019-20636: Out-of-bounds write via crafted keycode table.
Installing [5oec4s3n] Denial-of-service when mounting an ocfs2 filesystem.
Installing [rafq9pe9] CVE-2019-9503: Denial-of-service when receiving firmware event frames over a Broadcom WLAN USB dongle.
Installing [nlpu7kxi] Denial-of-service when initializing a serial CAN device.
Installing [lnz9di5t] CVE-2020-11608: NULL pointer dereference when initializing USB GSPCA based webcams.
Installing [2bodr9yk] CVE-2019-19537: Denial-of-service in USB character device registration.
Installing [9iw2y1wn] CVE-2019-19524: Use-after-free when unregistering memoryless force-feedback driver.
Installing [h5s7eh41] CVE-2020-11609: NULL pointer dereference when initializing STV06XX USB Camera device.
Installing [behlqry8] Denial-of-service via invalid TSC values in KVM.
Installing [onllaobw] CVE-2019-12819: Use-after-free during initialization of MDIO bus driver.
Installing [fdn63bdc] CVE-2019-11599: Information leak in the coredump implementation.
Installing [kb3b03z9] CVE-2019-19058: Denial-of-service in iwlwifi firmware interface.
Installing [mgfi6p6r] Use-after-free when writing to SLIP serial line.
Installing [hs2h9j8w] CVE-2019-14896, CVE-2019-14897: Denial-of-service when parsing BSS in Marvell 8xxx Libertas WLAN driver.
Installing [bb9sd52m] CVE-2020-11668: NULL pointer dereference when initializing Xirlink C-It USB camera device.
Installing [p4ygwgyj] Information leak in KVM's VMX operation path.
Installing [1uxt1xo6] NFSv4 client fails to correctly renew lease when using fsinfo.
Installing [hjoeh3zi] CVE-2020-0543: Side-channel information leak using SRBDS.
Your kernel is fully up to date.
Effective kernel version is 4.14.35-1902.303.4.1.el7uek
+------------------------------------------------------------------------+
|  Yum upgrade report                                                    |
+------------------------------------------------------------------------+
Running 'yum-cron' with update cmd: default.
+------------------------------------------------------------------------+
|  Ksplice updates status                                                |
+------------------------------------------------------------------------+
Running 'ksplice all show'.
Ksplice user-space updates:
No Ksplice user-space updates installed

Ksplice kernel updates:
Installed updates:
[cp1p7rl5] Known exploit detection.
[3kfqruxl] Known exploit detection for CVE-2017-7308.
[6vy9wlov] Known exploit detection for CVE-2018-14634.
[r8wncd28] KPTI enablement for Ksplice.
[3e9je971] Known exploit detection for CVE-2018-18445.
[20bmudk6] Out-of-bounds access when classifying network packets with traffic control index.
[oy5cke5u] NULL dereference while writing Hyper-V SINT14 MSR.
[5jsm8lzj] CVE-2020-9383: Information leak in floppy disk driver.
[5p7yd05e] NULL pointer dereference when initializing Differentiated Services marker driver.
[sajmv0xh] CVE-2018-19854: Information leak in cryptography socket NETLINK_CRYPTO call.
[1gefn4lp] CVE-2019-19965: Denial-of-service in SCSI device removal.
[6hu77eez] Invalid memory access when sending an excessively large packet using Segmentation Offloads.
[f0zxddhg] Livelock in loop device block resize operation.
[2lgm3hz9] CVE-2019-14814, CVE-2019-14815, CVE-2019-14816: Denial-of-service when parsing access point settings in Marvell WiFi-Ex driver.
[3yqxyw42] CVE-2019-20096: Memory leak while changing DCCP socket SP feature values.
[9g5kf79r] Improved fix for CVE-2020-2732: Privilege escalation in Intel KVM nested emulation.
[bq9hiiuj] Race condition in ipoib during high request load causes denial-of-service.
[3youemoz] CVE-2020-11494: Information leak in serial line CAN device communication.
[jpbi3wnm] Use-after-free when removing generic block device.
[if1ety6t] Memory corruption when reading EFI sysfs entries.
[iv8r17d8] CVE-2020-8648: Use-after-free in virtual terminal selection buffer.
[mojwd0zk] Various Spectre-V1 information leaks in KVM.
[nvi6r5wx] CVE-2019-19527: Denial-of-service in USB HID device open.
[o3df6mds] CVE-2020-8647, CVE-2020-8649: Use-after-free in the VGA text console driver.
[kjyqg48a] CVE-2019-19532: Denial-of-service when initializing HID devices.
[74j9dhee] Divide-by-zero when CPU capacity changes causes denial-of-service.
[lgsoxuy7] CVE-2019-19768: Use-after-free when reporting an IO trace.
[i622mubr] Information leak in KVM_HC_CLOCK_PAIRING hypercall.
[35xnb9pi] CVE-2019-9500: Potential heap overflow in Broadcom FullMAC WLAN driver.
[ppqwl5uh] CVE-2019-15505: Out-of-bounds access in Technisat DVB-S/S2 USB2.0 driver.
[ctobm6wo] CVE-2019-19767: Use-after-free in with malformed ext4 filesystems.
[l5so0kqe] CVE-2019-19056, CVE-2019-19057: Denial-of-service in the Marvell mwifiex PCIe driver.
[b4iszmv7] CVE-2019-20636: Out-of-bounds write via crafted keycode table.
[5oec4s3n] Denial-of-service when mounting an ocfs2 filesystem.
[rafq9pe9] CVE-2019-9503: Denial-of-service when receiving firmware event frames over a Broadcom WLAN USB dongle.
[nlpu7kxi] Denial-of-service when initializing a serial CAN device.
[lnz9di5t] CVE-2020-11608: NULL pointer dereference when initializing USB GSPCA based webcams.
[2bodr9yk] CVE-2019-19537: Denial-of-service in USB character device registration.
[9iw2y1wn] CVE-2019-19524: Use-after-free when unregistering memoryless force-feedback driver.
[h5s7eh41] CVE-2020-11609: NULL pointer dereference when initializing STV06XX USB Camera device.
[behlqry8] Denial-of-service via invalid TSC values in KVM.
[onllaobw] CVE-2019-12819: Use-after-free during initialization of MDIO bus driver.
[fdn63bdc] CVE-2019-11599: Information leak in the coredump implementation.
[kb3b03z9] CVE-2019-19058: Denial-of-service in iwlwifi firmware interface.
[mgfi6p6r] Use-after-free when writing to SLIP serial line.
[hs2h9j8w] CVE-2019-14896, CVE-2019-14897: Denial-of-service when parsing BSS in Marvell 8xxx Libertas WLAN driver.
[bb9sd52m] CVE-2020-11668: NULL pointer dereference when initializing Xirlink C-It USB camera device.
[p4ygwgyj] Information leak in KVM's VMX operation path.
[1uxt1xo6] NFSv4 client fails to correctly renew lease when using fsinfo.
[hjoeh3zi] CVE-2020-0543: Side-channel information leak using SRBDS.

Effective kernel version is 4.14.35-1902.303.4.1.el7uek

--
You are receiving notifications as a subscriber to the topic: AL (Topic OCID: ocid1.onstopic.oc1.eu-frankfurt-1.aaaaaaaaamo7khj3xab6oec5xt5c7ia6eokqszapwsafeje6g6ltlnhd363a). To stop receiving notifications from this topic, unsubscribe.

Please do not reply directly to this email. If you have any questions or comments regarding this email, contact your account administrator.
Ksplice updates

I’ve also seen a notification about failed updates:


+------------------------------------------------------------------------+
|  Summary (Mon Jun 29 08:03:19 GMT 2020)                                |
+------------------------------------------------------------------------+
Ksplice updates installed: failed
Yum updates installed: no
Uptime: 08:03:19 up 37 days,  2:25,  0 users,  load average: 0.31, 0.08, 0.03
+------------------------------------------------------------------------+
|  Ksplice upgrade report                                                |
+------------------------------------------------------------------------+
Running 'ksplice -y all upgrade'.
Updating on-disk packages for new processes
Loaded plugins: langpacks
No packages marked for update
Nothing to do.
Unexpected error communicating with the Ksplice Uptrack server. Please
check your network connection and try again. If this error re-occurs,
e-mail ksplice-support_ww@oracle.com.

(Network error: TCP connection reset by peer)

Ok, network error at that time.
However, the next run was ok:


+------------------------------------------------------------------------+
|  Summary (Tue Jun 30 08:03:13 GMT 2020)                                |
+------------------------------------------------------------------------+
Ksplice updates installed: no
Yum updates installed: no
Uptime: 08:03:13 up 38 days,  2:25,  1 user,  load average: 0.00, 0.00, 0.00

and I can confirm by running manually:


[opc@al ~]$ ksplice -y all upgrade
Error: failed to configure the logger
[opc@al ~]$ sudo ksplice -y all upgrade
Updating on-disk packages for new processes
Loaded plugins: langpacks
ol7_x86_64_userspace_ksplice                                                                                                                     | 2.8 kB  00:00:00
No packages marked for update
100% |################################################################################################################################################################|
Nothing to do.
Nothing to be done.
Your kernel is fully up to date.
Effective kernel version is 4.14.35-1902.303.4.1.el7uek

Ksplice is about the kernel and some user space libraries such as glibc and openssl.
But Autonomous Linux also updates the packages.
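
To look at the user-space side specifically, the same ksplice tool can be called with the user target, a small sketch assuming the syntax matches the ‘ksplice all show’ call run by al-update:

# user-space Ksplice updates only (kernel ones are listed by "sudo uptrack-show")
sudo ksplice user show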

Yum updates

In addition to kernel patches, the packages are also updated:


The following updates will be applied on al:
================================================================================
 Package                  Arch    Version                  Repository      Size
================================================================================
Installing:
 kernel                   x86_64  3.10.0-1127.13.1.el7     al7             50 M
Updating:
 bpftool                  x86_64  3.10.0-1127.13.1.el7     al7            8.4 M
 ca-certificates          noarch  2020.2.41-70.0.el7_8     al7            382 k
 kernel-tools             x86_64  3.10.0-1127.13.1.el7     al7            8.0 M
 kernel-tools-libs        x86_64  3.10.0-1127.13.1.el7     al7            8.0 M
 libgudev1                x86_64  219-73.0.1.el7_8.8       al7            107 k
 microcode_ctl            x86_64  2:2.1-61.10.0.1.el7_8    al7            2.7 M
 ntpdate                  x86_64  4.2.6p5-29.0.1.el7_8.2   al7             86 k
 python-perf              x86_64  3.10.0-1127.13.1.el7     al7            8.0 M
 python36-oci-cli         noarch  2.12.0-1.el7             al7            4.4 M
 python36-oci-sdk         x86_64  2.17.0-1.el7             al7             10 M
 rsyslog                  x86_64  8.24.0-52.el7_8.2        al7            620 k
 selinux-policy           noarch  3.13.1-266.0.3.el7_8.1   al7            497 k
 selinux-policy-targeted  noarch  3.13.1-266.0.3.el7_8.1   al7            7.2 M
 systemd                  x86_64  219-73.0.1.el7_8.8       al7            5.1 M
 systemd-libs             x86_64  219-73.0.1.el7_8.8       al7            416 k
 systemd-python           x86_64  219-73.0.1.el7_8.8       al7            143 k
 systemd-sysv             x86_64  219-73.0.1.el7_8.8       al7             95 k
Removing:
 kernel                   x86_64  3.10.0-1127.el7          @anaconda/7.8   64 M

Transaction Summary
================================================================================
Install   1 Package
Upgrade  17 Packages
Remove    1 Package
The updates were successfully applied

All packages are maintained up-to-date without human intervention and without downtime.
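
The yum part is handled by yum-cron, so you can check how it is configured on the instance. A small sketch, assuming the default configuration file location:

# how yum-cron is set up: update command, download and apply behaviour
grep -E '^(update_cmd|download_updates|apply_updates)' /etc/yum/yum-cron.conf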

Package repository

The package repository is limited:


[opc@al ~]$ yum repolist
Loaded plugins: langpacks
ol7_x86_64_userspace_ksplice/primary_db                                                                                                          | 193 kB  00:00:00
repo id                                                       repo name                                                                                           status
!al7/x86_64                                                   Autonomous Linux 7Server (x86_64)                                                                   3,392
ol7_x86_64_userspace_ksplice                                  Ksplice aware userspace packages for Oracle Linux 7Server (x86_64)                                    438
repolist: 3,830
[opc@al ~]$ yum list all | wc -l
1462

1462 packages in one repo.
As a comparison, here is an Oracle Enterprise Linux image:


[opc@ol ~]$ yum repolist
Loaded plugins: langpacks, ulninfo
Repodata is over 2 weeks old. Install yum-cron? Or run: yum makecache fast
repo id                                                  repo name                                                                                                status
!ol7_UEKR5/x86_64                                        Latest Unbreakable Enterprise Kernel Release 5 for Oracle Linux 7Server (x86_64)                            200
!ol7_addons/x86_64                                       Oracle Linux 7Server Add ons (x86_64)                                                                       421
!ol7_developer/x86_64                                    Oracle Linux 7Server Development Packages (x86_64)                                                        1,319
!ol7_developer_EPEL/x86_64                               Oracle Linux 7Server Development Packages (x86_64)                                                       31,78$
!ol7_ksplice                                             Ksplice for Oracle Linux 7Server (x86_64)                                                                 6,41$
!ol7_latest/x86_64                                       Oracle Linux 7Server Latest (x86_64)                                                                     18,86$
!ol7_oci_included/x86_64                                 Oracle Software for OCI users on Oracle Linux 7Server (x86_64)                                              26$
!ol7_optional_latest/x86_64                              Oracle Linux 7Server Optional Latest (x86_64)                                                            13,91$
!ol7_software_collections/x86_64                         Software Collection Library release 3.0 packages for Oracle Linux 7 (x86_64)                             14,47$
repolist: 87,645
[opc@ol ~]$ yum list all | wc -l
Repodata is over 2 weeks old. Install yum-cron? Or run: yum makecache fast
36720
[opc@ol ~]$

There is a lot more here. Remember that OEL is compatible with RHEL.

If you need more packages you can open an SR and ask to have them added to the Autonomous Linux repository. For example, I use tmux every day, especially in a free tier VM (see https://blog.dbi-services.com/always-free-always-up-tmux-in-the-oracle-cloud-with-ksplice-updates/).

If you don’t want to ask for it, you can add the public-yum-ol7.repo yourself:


[opc@al ~]$ sudo yum-config-manager --add-repo http://yum.oracle.com/public-yum-ol7.repo
Loaded plugins: langpacks
adding repo from: http://yum.oracle.com/public-yum-ol7.repo
grabbing file http://yum.oracle.com/public-yum-ol7.repo to /etc/yum.repos.d/public-yum-ol7.repo
repo saved to /etc/yum.repos.d/public-yum-ol7.repo

This added the public Oracle Enterprise Linux repository. Is it correct to do that? It depends on what you want: the minimum validated by Oracle to be autonomously updated without any problem, or a little additional customization.

And then install the package you want:


[opc@al ~]$ sudo yum install -y tmux

Loaded plugins: langpacks
Resolving Dependencies
--> Running transaction check
---> Package tmux.x86_64 0:1.8-4.el7 will be installed
--> Finished Dependency Resolution

Dependencies Resolved

========================================================================================================================================================================
 Package                              Arch                                   Version                                   Repository                                  Size
========================================================================================================================================================================
Installing:
 tmux                                 x86_64                                 1.8-4.el7                                 ol7_latest                                 241 k

Transaction Summary
========================================================================================================================================================================
Install  1 Package

Total download size: 241 k
Installed size: 554 k
Downloading packages:
tmux-1.8-4.el7.x86_64.rpm                                                                                                                        | 241 kB  00:00:00
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
  Installing : tmux-1.8-4.el7.x86_64                                                                                                                                1/1
  Verifying  : tmux-1.8-4.el7.x86_64                                                                                                                                1/1

Installed:
  tmux.x86_64 0:1.8-4.el7

Now the package is installed and will be updated along with the others.

Autonomous cron

Those updates are scheduled by cron, but you can change the schedule through the provided al-config bash script:


[opc@al ~]$ sudo al-config -s
Current daily auto update time window(24-hour): 7-11
Current daily auto update time(24-hour): 08:03

This has set a random time within the 7am to 11am window, which here is 08:03.


[opc@al ~]$ cat /etc/cron.d/al-update
# Daily cron job for AL auto updates.
# Created by al-config, do not modify this file.
# If you want to change update time, use
# 'sudo al-config -w ' to set auto update time window
3 8 * * * root /usr/sbin/al-update >/dev/null

That’s the autonomous thing here: you don’t set the crontab job yourself. You just call al-config with a time window and it sets the cron job for you at a random time within this window.

Let’s play with this:


[opc@al ~]$ sudo al-config -w 0-2
Configured daily auto update time window(24-hour): 0-2
Configured daily auto update time(24-hour): 01:12
Created cron job file /etc/cron.d/al-update .
[opc@al ~]$ sudo al-config -w 0-2
Configured daily auto update time window(24-hour): 0-2
Configured daily auto update time(24-hour): 01:33
Created cron job file /etc/cron.d/al-update .
[opc@al ~]$ sudo al-config -w 0-2
Configured daily auto update time window(24-hour): 0-2
Configured daily auto update time(24-hour): 00:47
Created cron job file /etc/cron.d/al-update .
[opc@al ~]$ sudo al-config -w 0-2
Configured daily auto update time window(24-hour): 0-2
Configured daily auto update time(24-hour): 00:00
Created cron job file /etc/cron.d/al-update .
[opc@al ~]$ sudo al-config -w 0-2
Configured daily auto update time window(24-hour): 0-2
Configured daily auto update time(24-hour): 00:41
Created cron job file /etc/cron.d/al-update .

You see the idea. Very simple. But simple is awesome, right?

What is this scheduled job doing autonomously every day? You see it in the notification e-mail. Basically it runs:


ksplice -y all upgrade
yum-cron
ksplice all show

and sends the output to your e-mail

This is what keeps your Autonomous Linux up-to-date: ksplice, yum, and the output sent to your e-mail through:


Received: by omta-ad1-fd1-102-eu-frankfurt-1.omtaad1.vcndpfra.oraclevcn.com (Oracle Communications Messaging Server 8.1.0.1.20200619 64bit (built Jun 19 2020)) w

This is an excerpt from the notification e-mail headers. “Oracle Communications Messaging Server” is a heritage from Sun which, according to wikipedia, has its roots in Netscape Messaging Server. All those little bricks from years of enterprise IT are nicely wired together to bring this automation known as Autonomous.

Cet article Oracle Autonomous Linux: cron’d ksplice and yum updates est apparu en premier sur Blog dbi services.

Oracle GoldenGate 19c: Cannot register Integrated EXTRACT due to ORA-44004

Mon, 2020-06-29 04:15

The GLOBAL_NAME of an Oracle database has a direct impact on the GoldenGate Extract process registration we need to do when we create an Integrated Extract.

In my example below, I use Oracle GoldenGate MicroServices architecture but the same behaviour occurs with Oracle GoldenGate Classic architecture.

 

Let’s start with the creation of the Extract process by clicking on the plus button :

 

Choose Integrated Extract and click on Next button:

Fill in all mandatory fields:

 

Add the list of tables you want to replicate:

 

Click on Create and Run the process. After a few seconds, the following error message appears :

An Integrated Extract must be registered in the database, and it is this registration process which fails.

By checking the report file, we see that the Logmining server does not exist in the database. This error is a consequence of the Extract registration failure.

 

The problem is Oracle GoldenGate doesn’t support the character “-” in the GLOBAL_NAME:

SQL> select * from global_name;

GLOBAL_NAME
--------------------------------------------------------------------------------
DB1.IT.DBI-SERVICES.COM

SQL> sho parameter db_domain

NAME     TYPE VALUE
------------------------------------ ----------- ------------------------------
db_domain     string it.dbi-services.com

The solution is to modify the db_domain and rename the global_name:

SQL> alter system set db_domain='it.dbiservices.com' scope=spfile;

System altered.


SQL> alter database rename global_name to DB1.ITDBISERVICES.COM;

Database altered.
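
A quick way to double-check the new value before retrying the registration (just a sketch, run as SYSDBA on the source database):

sqlplus -s / as sysdba <<'SQL'
-- the GLOBAL_NAME rename is immediate; db_domain (changed with scope=spfile)
-- only takes effect after an instance restart, but GoldenGate checks GLOBAL_NAME
select * from global_name;
SQL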

Let’s now try to create the Extract process again:

Now it works :

Let’s check the report file by clicking on the Action/Details button:

 

The new global_name (DB1.ITDBISERVICES.COM) is now active and the Logmining server is enabled in the database.

Cet article Oracle GoldenGate 19c: Cannot register Integrated EXTRACT due to ORA-44004 est apparu en premier sur Blog dbi services.

Postgres Vision 2020 is still live

Fri, 2020-06-26 04:07

This year the Postgres Vision 2020 conference was a free virtual event that took place on June 23-24, 2020, and the Postgres Vision theme was “Postgres is for builders”.


Experts, IT professionals and Postgres community leaders from all around the world were talking about the future of Postgres, real-world stories, Postgres in the cloud and Postgres in containers. This year there were also sessions focusing on how developers are building scalable and secure applications.

Introduction

At the following URL, Ed Boyajian, EDB CEO, has posted a short video introducing Postgres Vision 2020.

Now, once logged in, let’s get into the virtual Expo Hall.

To access sessions you have to enter the theater where all sessions are displayed with their abstract and speaker(s).

My first session was “Power to Postgres” and it started on Tuesday, 23/06/2020, at 9:15am (local time).
I was immediately blown away by the huge number of people connected, coming from all over the world.
During this 45-minute talk, Ed Boyajian, EDB CEO, shared his experience and vision for the Postgres movement along with stories of how organizations are leveraging Postgres to accelerate innovation.

His last words in this monologue were rather reassuring regarding the future of Postgres.
“This is the revolution and Postgres will win it, power to Postgres”

Following are only some of the “live” sessions I attended. I won’t detail them, as it is possible, once registered, to watch them again on demand (click on the registration button).

– “A Look at the Elephants Trunk – PostgreSQL 13” – Speaker : Magnus Hagander
– “What’s New in 3.0 and Coming in PostGIS 3.1?” – Speaker: Regina Obe
– “EDB Postgres Advanced Server – What’s New?” – Speaker: Rushabh Lathia
– “Table Partitioning in Postgres: How Far We’ve Come” – Speaker: Amit Langote

Conclusion

In these difficult times, especially because of Covid-19, this event was really a great success thanks to its organization and the quality of the speakers and their presentations. I think that in the future, this type of virtual event will become more widespread and allow a greater number of participants.
My only regret is that it does not allow you to meet the participants and speakers between sessions and to discuss with them.

Cet article Postgres Vision 2020 is still live est apparu en premier sur Blog dbi services.

Oracle Cloud basics for beginners

Thu, 2020-06-25 09:26
Introduction

Cloud, Cloud, Cloud. Everyone is talking about the Cloud but a lot of people are still in the fog with Cloud technologies. Let’s talk about basic features of the Oracle Cloud, called OCI for Oracle Cloud Infrastructure.

What is really OCI?

OCI is physically a lot of servers in datacenters all around the world. These servers are not very different from the servers you probably have in your own datacenter. Some of these servers are already in use by the customers, and some are immediately available for existing or new customers. Most of the customers will not use complete servers but part of them, thanks to the virtualization layer on OCI. A real server can hold several virtual servers, quite a lot actually. Oracle tells us that there is no overprovisioning on OCI: if you create your own server with 2 CPUs and 64GB of RAM, you’re pretty sure that these resources are available for you on the physical server, even if you don’t plan to use them at full throttle. If you need a complete physical server for yourself, it’s also possible, and it’s easy to provision just like a virtual machine.

What do I need to create a server in OCI?

OCI is actually available through a website, the OCI console, but you’ll have to buy Cloud credits to be able to create resources in this Cloud.

Two other options are available:
– ask your Oracle software provider for free trial Cloud credits for testing OCI
– create a free account and use only always-free resources (quite limiting)

When you’re connected to your brand new OCI account on the console, just create a compute instance. A compute instance is a server for multi-purpose usage. Several options are available at server creation, like the number of CPUs, the amount of RAM, the size of the boot disk, and the OS that will come pre-installed. Provisioning a simple Linux server takes 2 minutes. Deleting a server is also a matter of minutes.
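
For reference, the same server creation can be scripted with the OCI CLI. This is only a minimal sketch: all OCIDs, the availability domain and the shape are placeholders, and it assumes the compartment and subnet described below already exist.

oci compute instance launch \
  --compartment-id ocid1.compartment.oc1..aaaa... \
  --availability-domain "XXXX:EU-FRANKFURT-1-AD-1" \
  --shape VM.Standard2.1 \
  --subnet-id ocid1.subnet.oc1..aaaa... \
  --image-id ocid1.image.oc1..aaaa... \
  --ssh-authorized-keys-file ~/.ssh/id_rsa.pub \
  --display-name my-first-server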

Can I go straight to server creation?

Not really. You cannot simply create a server, because you’ll need to put this server in a compartment, a kind of virtual container for your servers. So the first step is to create a compartment. Compartments are fully isolated from each other.

Then, you’ll need a private network (called Virtual Cloud Network or VCN) in which to put your server. This private network should be created with care because it must not overlap your on-premise network, especially if you plan to connect them (you surely need to). Along with the network creation, other basic network components also need to be configured.

What are the basic network resources to configure?

First of all, all these resources are virtual resources in OCI. When configuring your network, you’ll also need at least one subnet from your VCN, a firewall (called security list), a router (route table) and a gateway for connecting this server (NAT gateway for outbound internet connections or internet gateway for both inbound and outbound connections).

Your OCI network will be linked to your on-premise network with IPSec VPN technology or FastConnect, the latter being a dedicated connection to your existing infrastructure that does not go through the internet.

So before creating your first server, you’ll need to define and configure all these network settings properly.
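
The same network bricks can also be created from the OCI CLI, again only a sketch with placeholder OCIDs and an example CIDR that must not overlap your on-premise ranges:

# virtual cloud network
oci network vcn create \
  --compartment-id ocid1.compartment.oc1..aaaa... \
  --cidr-block 10.1.0.0/16 \
  --display-name my-vcn
# one subnet inside it
oci network subnet create \
  --compartment-id ocid1.compartment.oc1..aaaa... \
  --vcn-id ocid1.vcn.oc1..aaaa... \
  --cidr-block 10.1.1.0/24 \
  --display-name my-subnet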

How to connect to this server?

If you don’t want to configure a VPN or a FastConnect link for now, you can associate your compute instance with an internet gateway to make it available from everywhere. Security is achieved with SSH keys: you provide your public key(s) on the OCI console for this server, and only you will be able to establish a connection to your server. Later, a VPN or FastConnect configuration will let you reach all your OCI servers as if they were on your network.

What are the other services available?

If you’re thinking about OCI, it’s probably because you do not only need servers: you need Oracle databases. Actually, you don’t have to provision compute instances to install databases on them. You can directly provision databases, for various versions, Standard or Enterprise Edition, with your own license or without any license (the license fee will be billed as if it were an OCI resource – on a monthly basis). For sure, an underlying server will be provisioned, but you don’t have to create it as a separate task. If you need to connect later to this server, it’s possible as if it were a normal compute instance.

A key feature of OCI is what they call autonomous database: it’s a self-managed database that doesn’t give you access to the server or even the SYSDBA role on the database. You control this kind of DB through a dedicated interface (for loading data for example) and let Oracle automatically manage the classic DBA tasks, even the high-level ones. Autonomous database comes in two flavours: OLTP or Data Warehouse. The embedded autonomous engine will act differently depending on the flavour.

Database services also come with automatic backups that you can simply configure when creating the database (or after). Just define what kind of backup you need (mainly choosing from various retentions and frequencies) and RMAN will automatically take care of your backups. Restores can be done directly through the OCI console.

Other services are also available, like load balancers or MySQL databases. Some services are free, some come at a cost.

How about the storage?

Multiple storage options are available for your servers depending on your needs:
– block storage: this is similar to LUNs on a SAN. Choose the size at block storage creation and attach this storage to your server for dedicated use
– file storage: this is similar to NFS, a shared storage for multiple servers
– object storage: this storage is useful to make some files available wherever you need, just by sharing a link

Storage on OCI relies only on SSD disks, so expect high I/O performance.

How much it costs?

That’s the most difficult question, because you’ll have to define your needs, build your infrastructure on paper, then compute the cost with the cost calculator provided by Oracle. There are two billing options available at the moment: prepaid, with Universal Cloud Credits, or pay-as-you-go based on service utilization. The costs may vary depending on multiple parameters. The base budget for an OCI infrastructure starts from $1,000 a month. Don’t expect an OCI infrastructure to be much less expensive than on-premise servers: it’s mainly interesting because you don’t have to bother with budgeting, buying, deploying and managing servers on your own. And think about how quickly you can deploy a new environment, or destroy an old one. It’s another way of spending your IT budget.

The cost calculator is here.

Conclusion

OCI is a mature cloud, ready for production, with multiple services available and constantly evolving. Test it to discover how powerful it is and make sure you understand all the benefits you can get compared to on-premises solutions.

The article Oracle Cloud basics for beginners first appeared on Blog dbi services.

Attaching your own CentOS 7 yum repository to AWS SSM

Thu, 2020-06-25 06:02

From some blogs I've written in the past you might already know that we are using AWS SSM to patch and maintain all the EC2 and on-prem instances at one of our customers. The previous posts about that topic are here:

While that is generally working fine and fully automated, we recently ran into an issue that forced us to create our own CentOS 7 repositories and use them with SSM to apply patches to the CentOS machines.

To describe the issue: we have two patch baselines per operating system. One is for all development and test systems and applies all patches released up until the date the patch baseline runs. The second one is for the production systems, with an approval delay of 14 days. As we run production patching two weeks after we patch the development and test systems, that should guarantee we get the same patches applied to production. And exactly here is the issue: “if a Linux repository doesn’t provide release date information for packages, Systems Manager uses the build time of the package as the auto-approval delay for Amazon Linux, Amazon Linux 2, RHEL, and CentOS. If the system isn’t able to find the build time of the package, Systems Manager treats the auto-approval delay as having a value of zero.”. That basically means: as you never know when CentOS will release their patches, which are based on the RedHat sources, you can never be sure that you get the same patches applied to production as were applied 14 days before to development and test.

Let's do an example: our patching for development and test happened on the 10th of April. The kernel package v3.10.0-1127 was released for CentOS on April 27th and was therefore not applied to the development and test systems. When production patching happened two weeks later, that kernel package was available and also satisfied our auto-approval rule of 14 days. So we basically had a patch installed in production which never made it to the development and test systems. This is why we decided to go for our own repositories, so we can decide when the repositories are synced.
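For reference, that 14 days delay is just an approval rule on the production patch baseline. Created through the AWS CLI it would look roughly like this (a simplified sketch, the baseline name and the wildcard product filter are example values):

# Production baseline for CentOS with a 14 days auto-approval delay (example values)
aws ssm create-patch-baseline \
  --name "centos7-prod" \
  --operating-system CENTOS \
  --approval-rules "PatchRules=[{PatchFilterGroup={PatchFilters=[{Key=PRODUCT,Values=[*]}]},ApproveAfterDays=14}]"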

Setting up a local yum repository is quite easy and you can find plenty of how-tos on the internet, so here is just a summary without much explanation. We deployed a new CentOS 7 EC2 instance, then installed a web server and the EPEL repository:

[centos@ip-10-47-99-158 ~]$ sudo yum install epel-release nginx -y
[centos@ip-10-47-99-158 ~]$ sudo systemctl start nginx
[centos@ip-10-47-99-158 ~]$ sudo systemctl enable nginx
[centos@ip-10-47-99-158 ~]$ sudo systemctl status nginx

As yum fetches the packages over HTTP or HTTPS, adjust the firewall rules:

[centos@ip-10-47-99-158 ~]$ sudo systemctl start firewalld
[centos@ip-10-47-99-158 ~]$ sudo systemctl enable firewalld
[centos@ip-10-47-99-158 ~]$ sudo firewall-cmd --zone=public --permanent --add-service=http
[centos@ip-10-47-99-158 ~]$ sudo firewall-cmd --zone=public --permanent --add-service=https
[centos@ip-10-47-99-158 ~]$ sudo firewall-cmd --reload

Update the complete system and install the yum utilities and the createrepo packages:

[centos@ip-10-47-99-158 ~]$ sudo yum update -y
[centos@ip-10-47-99-158 ~]$ sudo yum install createrepo  yum-utils -y

Prepare the directory structure and synchronize the repositories:

[centos@ip-10-47-99-158 ~]$ sudo mkdir -p /var/www/html/repos
[centos@ip-10-47-99-158 ~]$ sudo chmod -R 755 /var/www/html/repos
[centos@ip-10-47-99-158 ~]$ sudo reposync -g -l -d -m --repoid=base --newest-only --download-metadata --download_path=/var/www/html/repos/centos-7/7/
[centos@ip-10-47-99-158 ~]$ sudo reposync -l -d -m --repoid=extras --newest-only --download-metadata --download_path=/var/www/html/repos/centos-7/7/
[centos@ip-10-47-99-158 ~]$ sudo reposync -l -d -m --repoid=updates --newest-only --download-metadata --download_path=/var/www/html/repos/centos-7/7/
[centos@ip-10-47-99-158 ~]$ sudo reposync -l -d -m --repoid=epel --newest-only --download-metadata --download_path=/var/www/html/repos/centos-7/7/

Create the repositories from what was synced above:

[centos@ip-10-47-99-158 ~]$ sudo createrepo /var/www/html/repos/centos-7/7/base
[centos@ip-10-47-99-158 ~]$ sudo createrepo /var/www/html/repos/centos-7/7/extras
[centos@ip-10-47-99-158 ~]$ sudo createrepo /var/www/html/repos/centos-7/7/updates
[centos@ip-10-47-99-158 ~]$ sudo createrepo /var/www/html/repos/centos-7/7/epel

… and set the SELinux context:

[centos@ip-10-47-99-158 ~]$ sudo semanage fcontext -a -t httpd_sys_content_t "/var/www/html/repos(/.*)?"
[centos@ip-10-47-99-158 ~]$ sudo restorecon -Rv /var/www/html/repos

Configure nginx to point to the repositories:

[centos@ip-10-47-99-158 ~]$ sudo vi /etc/nginx/conf.d/repos.conf 
## add the following section
server {
        listen   80;
        server_name  10.47.99.158;	
        root   /var/www/html/repos/;
        location / {
                index  index.php index.html index.htm;
                autoindex on;	#enable listing of directory index
        }
}

… and restart the webserver:

[centos@ip-10-47-99-158 ~]$ sudo systemctl restart nginx

From now on you should see the directory structure when you point your browser to the IP of the EC2 instance:

To regularly synchronize the repositories, depending on your requirements, create a small script that does the job and schedule it with cron, e.g.:

#!/bin/bash
LOCAL_REPOS="base extras updates epel"
## loop over the repositories and update them one at a time
for REPO in ${LOCAL_REPOS}; do
    if [ "$REPO" = "base" ]; then
        # the base repository also needs the package group metadata (-g)
        reposync -g -l -d -m --repoid=$REPO --newest-only --download-metadata --download_path=/var/www/html/repos/centos-7/7/
    else
        reposync -l -d -m --repoid=$REPO --newest-only --download-metadata --download_path=/var/www/html/repos/centos-7/7/
    fi
    createrepo /var/www/html/repos/centos-7/7/$REPO
done
## reset the SELinux context once all repositories are refreshed
semanage fcontext -a -t httpd_sys_content_t "/var/www/html/repos(/.*)?"
restorecon -Rv /var/www/html/repos
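Assuming the script above was saved as /root/sync_repos.sh (the path and schedule below are just examples), a possible root crontab entry to run it every Sunday at 02:00 would be:

# m h dom mon dow  command
0 2 * * 0 /root/sync_repos.sh >> /var/log/reposync.log 2>&1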

Test the repository from another CentOS 7 instance:

Using username "centos".
Authenticating with public key "imported-openssh-key"
[centos@ip-10-47-98-80 ~]$ sudo bash
[root@ip-10-47-98-80 centos]$ cd /etc/yum.repos.d/
[root@ip-10-47-98-80 yum.repos.d]$ ls
CentOS-Base.repo  CentOS-CR.repo  CentOS-Debuginfo.repo  CentOS-fasttrack.repo  CentOS-Media.repo  CentOS-Sources.repo  CentOS-Vault.repo
[root@ip-10-47-98-80 yum.repos.d]$ rm -f *
[root@ip-10-47-98-80 yum.repos.d]$ ls -la
total 12
drwxr-xr-x.  2 root root    6 Jun 25 06:39 .
drwxr-xr-x. 77 root root 8192 Jun 25 06:36 ..


[root@ip-10-47-98-80 yum.repos.d]$ cat local-centos.repo
[local]
name=CentOS Base
baseurl=http://10.47.99.158/centos-7/7/base/
gpgcheck=0
enabled=1

[extras]
name=CentOS Extras
baseurl=http://10.47.99.158/centos-7/7/extras/
gpgcheck=0
enabled=1

[updates]
name=CentOS Updates
baseurl=http://10.47.99.158/centos-7/7/updates/
gpgcheck=0

[epel]
name=EPEL
baseurl=http://10.47.99.158/centos-7/7/epel/
gpgcheck=0
[root@ip-10-47-98-80 yum.repos.d]#


[root@ip-10-47-98-80 yum.repos.d]$ yum search wget
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
======================================================================= Name Exactly Matched: wget =======================================================================
wget.x86_64 : A utility for retrieving files using the HTTP or FTP protocols

  Name and summary matches only, use "search all" for everything.
[root@ip-10-47-98-80 yum.repos.d]#

… and you’re done from a repository perspective.

Now it is time to tell SSM to use the local repositories with your patch baseline. If you don't know how SSM works or how you can apply patches using SSM, check the previous post.

All you need to do is to adjust the patch baseline to include your repositories as “Patch sources”:
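If you prefer scripting over the console, the same patch source definition can also be attached to an existing baseline with the AWS CLI. A rough sketch with a placeholder baseline id (one entry per repository; the configuration string is simply the content of the corresponding repo file):

# Attach the local "base" repository as a patch source of an existing baseline (placeholder id)
aws ssm update-patch-baseline \
  --baseline-id pb-0123456789abcdef0 \
  --sources '[{"Name":"base","Products":["*"],"Configuration":"[local]\nname=CentOS Base\nbaseurl=http://10.47.99.158/centos-7/7/base/\ngpgcheck=0\nenabled=1"}]'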

Schedule your patching and then check the logs. You should see that SSM is now using the local repositories:

...
u'sources': [{u'configuration': u'[local]\nname=CentOS Base\nbaseurl=http://10.47.99.158/centos-7/7/base/\ngpgcheck=0\nenabled=1', u'products': [u'*'], u'name': u'base'}, {u'configuration': u'[extras]\nname=CentOS Extras\nbaseurl=http://10.47.99.158/centos-7/7/extras/\ngpgcheck=0\nenabled=1', u'products': [u'*'], u'name': u'extras'}, {u'configuration': u'[updates]\nname=CentOS Updates\nbaseurl=http://10.47.99.158/centos-7/7/updates/\ngpgcheck=0', u'products': [u'*'], u'name': u'updates'}
...
06/25/2020 10:00:28 root [INFO]: Moving file: CentOS-Base.repo
06/25/2020 10:00:28 root [INFO]: Moving file: CentOS-CR.repo
06/25/2020 10:00:28 root [INFO]: Moving file: CentOS-Debuginfo.repo
06/25/2020 10:00:28 root [INFO]: Moving file: CentOS-Media.repo
06/25/2020 10:00:28 root [INFO]: Moving file: CentOS-Sources.repo
06/25/2020 10:00:28 root [INFO]: Moving file: CentOS-Vault.repo
06/25/2020 10:00:28 root [INFO]: Moving file: CentOS-fasttrack.repo
06/25/2020 10:00:28 root [INFO]: Moving file: CentOS-x86_64-kernel.repo
06/25/2020 10:00:28 root [INFO]: Executing lambda _create_custom_repos
06/25/2020 10:00:28 root [INFO]: Creating custom repo base
06/25/2020 10:00:28 root [INFO]: Creating custom repo extras
06/25/2020 10:00:28 root [INFO]: Creating custom repo updates
06/25/2020 10:00:28 root [INFO]: Creating custom repo epel
Loaded plugins: fastestmirror

That's it. Doing it this way, you have full control over which packages will be installed. The downside, of course, is that you need to maintain your own copy of the repositories.

The article Attaching your own CentOS 7 yum repository to AWS SSM first appeared on Blog dbi services.

Some myths about PostgreSQL vs. Oracle

Wed, 2020-06-24 12:42
By Franck Pachot

.
I originally wrote this as a comment on the following post that you may find on the internet:
https://www.2ndquadrant.com/en/blog/oracle-to-postgresql-reasons-to-migrate/
but my comment was not published (many links in it… I suppose it was flagged as spam?), so I post it here.

You should never take any decision based on what you read on the internet without verifying it. It is totally valid to consider a move to open source databases, but doing it without a good understanding puts your migration project's success at risk.

The quotes from the article are reproduced below in quotation marks.

Kirk,
As you do a comparison and link to a list of PostgreSQL features, let me refine the name and description of the Oracle features you compare to, so that people can find them and do a fair comparison. I’m afraid they may not recognize the names and descriptions you provide, at least in current versions. As an example, nobody will get search hits for “Federation”, or “plSQL”, or “HTML DB”… in the Oracle documentation but they will find “Oracle Gateway”, “PL/SQL”, “APEX”…

Federation vs. Foreign Data Wrappers


There is no feature called “Federation”. 
The closest to your description is Database Links and Heterogeneous Services through Database Gateway. They go further than FDW in many respects. But anyway, I would never use that for ETL. ETL needs optimized bulk loads and there are other features for that (like External Tables to read files, and direct-path inserts for fast loading). If your goal is to federate and distribute some small reference tables, then Materialized Views is the feature you are looking for.
https://docs.oracle.com/en/database/oracle/oracle-database/20/heter/introduction.html#GUID-EC402025-0CC0-401F-AF93-888B8A3089FE
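To make that last point concrete, distributing a small reference table through a materialized view over a database link would look roughly like this (hypothetical object, user and TNS names; just a sketch, not taken from the article):

sqlplus app_user@REPORTING_DB <<'EOF'
-- database link to the source database holding the reference data (hypothetical names)
CREATE DATABASE LINK src_db CONNECT TO ref_owner IDENTIFIED BY "secret" USING 'SRC_TNS';
-- local, periodically refreshed copy of the reference table
CREATE MATERIALIZED VIEW countries_mv
  REFRESH COMPLETE ON DEMAND
  AS SELECT * FROM countries@src_db;
EOF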

plSQL vs. everything else


“Oracle has a built-in programming language called plSQL.”
PL/SQL is more than that. It is compiled (to pcode or native), manages dependencies (tracks dependencies on schema objects), is optimized for data access (a UDF can even be compiled to run within the SQL engine), and can be multithreaded (Parallel Execution). That's different from PL/pgSQL, which is interpreted at execution time. You mention languages "as plug-ins" and for this there are other ways to run different languages (external procedures, OJVM, External Table preprocessor,…) but when it comes to performance, transaction control, dependency tracking,… that's PL/SQL.
https://docs.oracle.com/en/database/oracle/oracle-database/20/lnpls/overview.html#GUID-17166AA4-14DC-48A6-BE92-3FC758DAA940

Application programming


Providing an "API to communicate with the database" is not about open source, as the main goal is encapsulation and hiding implementation details. In order to access internal structures, which is what you mention, Oracle provides relational views (known as V$ views) accessible with the most appropriate API for a relational database: SQL.
https://docs.oracle.com/en/database/oracle/oracle-database/20/refrn/dynamic-performance-views.html#GUID-8C5690B0-DE10-4460-86DF-80111869CF4C

Internationalization and Localization


The "globalization toolkit" is only one part of the globalization features. You can also use "any character encoding, collation and code page", but not relying on the OS implementation makes it cross-platform compatible and safe across OS upgrades (see https://wiki.postgresql.org/wiki/Locale_data_changes)
https://docs.oracle.com/en/database/oracle/oracle-database/20/nlspg/overview-of-globalization-support.html#GUID-6DD587EE-6686-4802-9C08-124B495978D5

Web Development


“Oracle acknowledges the existence of HTML through HTML DB. PostgreSQL natively supports JSON, XML and plugs in Javascript”. HTML DB can still be found in paper books, but it has been called "APEX" since 2006. And it is not (only) about HTML, JSON, or XML: it is a low-code Rapid Application Development platform with no equivalent for other databases.
Support for the structures and languages you mention is all there, the latest trend being JSON: https://docs.oracle.com/en/database/oracle/oracle-database/20/adjsn/index.html

Authentication


“Oracle has a built-in authentication system.”
Yes, in order to be platform-independent, and it also supports many external authentication methods: https://docs.oracle.com/en/database/oracle/oracle-database/20/dbseg/configuring-authentication.html#GUID-BF8E5E84-FE7E-449C-8081-755BAA4CF8DB

Extensibility


“Oracle has a plug-in system”. I don't know what you are referring to. Oracle is multi-platform proprietary software. Commercial, which means vendor-supported. There are a lot of APIs for extensions, but the vendor has to control what runs in the engine in order to provide support.

Read Scalability


“PostgreSQL can create a virtually unlimited read cluster”. Oracle has an active/active cluster (called RAC) and read replicas (called Active Data Guard). For horizontal scalability, you use the same feature as for vertical scalability (Parallel Execution), across multiple nodes (in sync, with instance affinity on partitions,…)
https://docs.oracle.com/en/database/oracle/oracle-database/20/vldbg/parallel-exec-intro.html#GUID-F9A83EDB-42AD-4638-9A2E-F66FE09F2B43

Cost


“they don’t mind charging you again for every single instance.” 
No, that's wrong: license metrics are based on processors (CPU) or named users (NUP). You run as many instances as you want on your licensed servers for your licensed users: https://www.oracle.com/a/ocom/docs/corporate/oracle-software-licensing-basics.pdf
“jamming everything into a single instance just to reduce costs”
No, database consolidation is recommended to scale the management of multiple databases, but not for licensing costs. If you go there, there are a lot of features to allow isolation and data movement in consolidated databases: Multitenant, Resource Manager, Online Relocate, Lockdown Profiles,…

Performance


“differentiate the tuning parameters for your warehouse to OLTP to reporting to the data lake”: I already mentioned the point about read replicas and about multiple instances on a server. But with Oracle, the parameters I want to set differently for OLTP or reporting do not require another instance: they can be set at session or PDB level. As Oracle does not need the filesystem buffer cache, there's no need to separate workloads onto different servers to avoid noisy neighbours.
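As a small illustration of that last point, a reporting session (or a dedicated PDB) can get its own settings without any extra instance; the connection and parameter values below are purely hypothetical:

sqlplus -s report_user@PDB_REPORTING <<'EOF'
-- session-level tuning for a reporting workload (hypothetical values)
ALTER SESSION SET parallel_degree_policy = AUTO;
ALTER SESSION SET optimizer_mode = FIRST_ROWS_100;
-- or persist a setting for the whole PDB, while connected to that PDB
ALTER SYSTEM SET pga_aggregate_target = 4G SCOPE = BOTH;
EOF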

I hope this helps to look further at the features. There are many reasons to migrate, and the main one is the will to move from a commercial model (with license and support) to an open source one (start with low cost, help from the community). But decisions must be made on facts, not rumours.

Franck.

The article Some myths about PostgreSQL vs. Oracle first appeared on Blog dbi services.

Azure Migrate: how to assess on-premises servers for a future Azure migration

Thu, 2020-06-18 02:14

Azure Migrate provides a set of tools to assess and migrate on-premises infrastructure such as Hyper-V or VMware virtual machines and physical servers, as well as SQL and other databases, web applications, and more.
The Azure Migrate overview page summarizes the different key scenarios available.


Let's have a look at how easily you can assess your Hyper-V virtual machines for a future migration to Azure with this migration tool.
You need to add a tool to discover your VMs, but before that you need to create a specific resource group for this Azure migration, give a name to the project and choose a preferred location for your future Azure virtual machines.

Once done, you need to select the assessment tool followed by the migration tool. Personally, I chose the Azure ones, but other ISV (independent software vendor) offerings are available.
At the end of this process you can review your selection and, if you are satisfied with it, click on the "Add tools" button:

The next step is to discover your on-premises servers, in my case virtual machines. To do this you have two possibilities:

  • deploy an Azure Migrate appliance: it stays connected to Azure Migrate and continuously discovers your on-premises machines.
  • use a CSV file to import an inventory of the VMs you would like to quickly assess, to get a cost and compatibility overview. If you need an accurate assessment you have to use the appliance, and the same applies if you want to migrate the assessed servers.

For my first test I used a CSV file containing 3 virtual machines (a sketch of such a file is shown below). Once imported, you can see my 3 discovered servers in the Azure Migrate Server Assessment.
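The CSV simply describes the servers to assess. It could look something like the sketch below; the column names and values here are purely illustrative, the authoritative header comes from the CSV template you can download from the Azure Migrate portal:

Server name,IP addresses,Cores,Memory (MB),OS name,OS version,Number of disks
app-vm-01,10.10.1.11,4,8192,Windows Server 2016,10.0,2
app-vm-02,10.10.1.12,2,4096,CentOS Linux,7.8,1
db-vm-01,10.10.1.13,8,16384,Windows Server 2012 R2,6.3,3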

If you click on the discovered servers, you get an overview of each server and can see that they have been imported rather than discovered by an appliance:

You can now create a group, which is a collection of machines you would like to assess and migrate together. In this example I took two of my imported machines:

Now that your group is created you can run an assessment for this group by:

  • selecting Imported machines
  • give a name to your assessment
  • choose the sizing criteria for your assessment:
    • Performance-based: based on performance-data values specified
    • As on-premises: based on on-premises sizing
  • select or create a new group
  • click the “Create assessment” button:

After a few minutes the assessment is available and you can visualize it in 3 parts:

  • Azure Readiness: tells you if your servers are ready for migration to Azure; if not, it shows the problems and possible remediations
  • Monthly cost estimate: sum based on size recommendations for Azure VMs (compute) and associated VMs storage
  • Storage – Monthly cost estimate: total storage cost split by storage type

You can get more information by clicking on each part of the assessment.
On Azure Readiness you can see the type of Azure virtual machine that has been selected during the assessment, and the number of disks per on-premises VM and their size:

If you click on a specific on-premises machine you will have a more detailed view:

  • a green light for the migration + the Azure VM size + the estimated monthly cost split between compute and storage
  • the operating system used on your VM
  • a compute summary with the number of cores and the amount of RAM
  • the selected storage with the number of disks and the target disk type & size

As mentioned at the beginning of this blog, I cannot migrate my on-premises virtual machines after this assessment because I didn't use the Azure Migrate appliance to discover my on-premises servers. Nevertheless, it's a good starting point to get a better feeling for the cost if you want to migrate to Azure.
I will definitely deploy an appliance in my next blog post to get an accurate assessment and follow it with a migration.

The article Azure Migrate: how to assess on-premises servers for a future Azure migration first appeared on Blog dbi services.

Documentum – Change/Rename of LDAP Config object

Wed, 2020-06-17 14:04

Using an LDAP Server with Documentum (or any ECM for that matter) is pretty common to avoid managing users locally. In this blog, I wanted to talk about something that isn’t very common and that is the change of an LDAP Server. I’m not talking about just changing the server where your LDAP is hosted but rather changing the full LDAP Server Config inside Documentum (including its name). This is probably something that you will not do very often but I had this use case before so I thought it would be interesting to share.

 

So the use case I had was the following one: during a migration (with an upgrade included in the process) of dozens of environments from Virtual Machines to Kubernetes pods, I had to automate the setup & management of the LDAP Server as well as normalize the configuration (name & other parameters) according to certain characteristics. The source and target LDAP Server was the same (a_application_type: netscape) so it was really just a matter of automation and conventions (if the target isn't the same, it doesn't change anything for this blog). As you may know if you already came across one of my previous blogs, DA does some magic, which prevents you from managing the LDAP Server in the same way between DA and automation.

 

Therefore, the first part of the use case could have just been to change the LDAP Config object "certdb_location" parameter from the default "ldapcertdb_loc" to another "dm_location" that uses a "file" "path_type" and not a "directory" one. If you understood that last sentence, well done! If you didn't, don't worry, I would just suggest you go and read the blog I linked above. It shows why you would need to use a dedicated dm_location with a path_type set to "file" (pointing to the SSL certificate of the LDAP Server) in order to automate the setup of an LDAP Config object. Of course, this is only needed if you are using SSL communications with the LDAP Server (i.e. LDAPS). That would have been the simple, easily understandable part; however, it is another story if you also need to normalize the LDAP Config object setup in the process. Changing parameters of the LDAP Config object is easy and shouldn't bring any issues or interruptions, but it's another story if you need to change the name of the LDAP object… If the field is greyed out in DA, it's for a reason ;).
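For illustration only, creating such a dedicated dm_location pointing to the LDAP certificate file could be scripted through iapi along these lines (the object name and certificate path are hypothetical; the linked blog describes the full procedure):

iapi ${docbase} -U${USER} -Pxxx << EOC
create,c,dm_location
set,c,l,object_name
ldap_cert_loc
set,c,l,path_type
file
set,c,l,file_system_path
/app/dctm/dba/secure/ldap_server_cert.pem
save,c,l
EOC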

 

The dozens of environments were all using the exact same backend LDAP Server, but there were still some small differences in the setup: different names of course, different attributes mapped here and there, and things like that. To be sure everything would be aligned on the target, I took the simplest road: removing all the source LDAP Config objects and recreating them in our CI/CD (JenkinsX pipelines, Ansible playbooks, and so on). Before removing anything, make sure that you understand what it means for your specific setup, it might not always be that straightforward! Alright, let's check the status of the LDAP at the beginning:

[dmadmin@stg_cs ~]$ export docbase=REPO1
[dmadmin@stg_cs ~]$
[dmadmin@stg_cs ~]$ iapi ${docbase} -U${USER} -Pxxx << EOC
> ?,c,select r_object_id, object_name from dm_ldap_config;
> ?,c,select ldap_config_id from dm_server_config;
> ?,c,select user_source, count(*), user_login_domain from dm_user group by user_login_domain, user_source;
> EOC


        EMC Documentum iapi - Interactive API interface
        (c) Copyright EMC Corp., 1992 - 2016
        All rights reserved.
        Client Library Release 7.3.0040.0025


Connecting to Server using docbase REPO1
[DM_SESSION_I_SESSION_START]info:  "Session 010f123451ad5e0f started for user dmadmin."


Connected to Documentum Server running Release 7.3.0050.0039  Linux64.Oracle
Session id is s0
API> 
r_object_id       object_name
----------------  ----------------
080f1234500edaf3  Source_LDAP_Name
(1 row affected)

API> 
ldap_config_id
----------------
080f1234500edaf3
080f1234500edaf3
080f1234500edaf3
(3 rows affected)

API> 
user_source       count(*)                user_login_domain
----------------  ----------------------  -----------------
LDAP                               55075  Source_LDAP_Name
inline password                      307  
                                    1466  
(3 rows affected)

API> Bye
[dmadmin@stg_cs ~]$

 

As you can see above, there is currently one LDAP Config object that is used by around 55k users. There are three rows for the second query because it's an HA Repository with 3 Content Servers. The first thing I did was therefore to remove the LDAP Config object and change the different references to prepare for the new name of the object (you could put the last two update commands into just one query, which would be better, but I split them here for this example):

[dmadmin@stg_cs ~]$ export old_ldap_name=Source_LDAP_Name
[dmadmin@stg_cs ~]$ export new_ldap_name=Target_LDAP_Name
[dmadmin@stg_cs ~]$
[dmadmin@stg_cs ~]$ iapi ${docbase} -U${USER} -Pxxx << EOC
> ?,c,delete dm_ldap_config objects where object_name='${old_ldap_name}';
> ?,c,update dm_server_config object set ldap_config_id='0000000000000000';
> ?,c,update dm_user object set user_login_domain='${new_ldap_name}' where user_login_domain='${old_ldap_name}' and user_source='LDAP';
> ?,c,update dm_user object set user_global_unique_id='' where user_login_domain='${new_ldap_name}' and user_source='LDAP';
> EOC


        EMC Documentum iapi - Interactive API interface
        (c) Copyright EMC Corp., 1992 - 2016
        All rights reserved.
        Client Library Release 7.3.0040.0025


Connecting to Server using docbase REPO1
[DM_SESSION_I_SESSION_START]info:  "Session 010f123451ad5e11 started for user dmadmin."


Connected to Documentum Server running Release 7.3.0050.0039  Linux64.Oracle
Session id is s0
API> 
objects_deleted
---------------
              1
(1 row affected)
[DM_QUERY_I_NUM_UPDATE]info:  "1 objects were affected by your DELETE statement."


API> 
objects_updated
---------------
              3
(1 row affected)
[DM_QUERY_I_NUM_UPDATE]info:  "3 objects were affected by your UPDATE statement."


API> 
objects_updated
---------------
          55075
(1 row affected)
[DM_QUERY_I_NUM_UPDATE]info:  "55075 objects were affected by your UPDATE statement."


API> 
objects_updated
---------------
          55075
(1 row affected)
[DM_QUERY_I_NUM_UPDATE]info:  "55075 objects were affected by your UPDATE statement."


API> Bye
[dmadmin@stg_cs ~]$

 

At this point, the new LDAP Config object wasn't created yet, but to avoid any disturbance I already changed the "user_login_domain" from the old name to the new one. Another point to note is the reset of the "user_global_unique_id" parameter to an empty value. As soon as users are synchronized using an LDAP, for example, they are assigned a unique identity. For an LDAP account, the id should be something like "###dm_ldap_config.object_name###:###random_numbers###" (E.g.: Source_LDAP_Name:078b1d9f-f4c35cd2-cad867c1-1f4a7872) while for an inline account it is more like "###dm_docbase_config.object_name###:###dm_user.user_name###" (E.g.: REPO1:Patou Morgan).

If you just change the LDAP Config object name and update the "user_login_domain" of the users without anything else, the users won't be able to log in, and if you execute the LDAP Sync, you should see some warnings showing that your users haven't been synced (skipped). This is because the user already exists with a different identity. The message in the LDAP Sync would be something like "WARNING: A User with same user_name (Patou Morgan) exists in the docbase with a different identity". The identity is generated by Documentum and written in the dm_user object under the "user_global_unique_id" attribute. Setting it to an empty value allows Documentum to generate a new identity so that the user can work with the new LDAP.

 

Verification after the deletion:

[dmadmin@stg_cs ~]$ iapi ${docbase} -U${USER} -Pxxx << EOC
> ?,c,select r_object_id, object_name from dm_ldap_config;
> ?,c,select ldap_config_id from dm_server_config;
> ?,c,select user_source, count(*), user_login_domain from dm_user group by user_login_domain, user_source;
> EOC


        EMC Documentum iapi - Interactive API interface
        (c) Copyright EMC Corp., 1992 - 2016
        All rights reserved.
        Client Library Release 7.3.0040.0025


Connecting to Server using docbase REPO1
[DM_SESSION_I_SESSION_START]info:  "Session 010f123451ad5e12 started for user dmadmin."


Connected to Documentum Server running Release 7.3.0050.0039  Linux64.Oracle
Session id is s0
API> 
r_object_id       object_name
----------------  ----------------
(0 rows affected)

API> 
ldap_config_id
----------------
0000000000000000
0000000000000000
0000000000000000
(3 rows affected)

API> 
user_source       count(*)                user_login_domain
----------------  ----------------------  -----------------
LDAP                               55075  Target_LDAP_Name
inline password                      307  
                                    1466  
(3 rows affected)

API> Bye
[dmadmin@stg_cs ~]$

 

Once that was done, I simply re-created a new LDAP Config object as described in the blog I linked at the beginning of this post, using the new name "Target_LDAP_Name" (including the re-encryption of the password). Then I ran the LDAP Sync to make sure everything was working properly and checked the user_global_unique_id of some LDAP users to make sure it had been regenerated properly. It should be something like "Source_LDAP_Name:078b1d9f-f4c35cd2-cad867c1-1f4a7872" before, and then updated to something like "Target_LDAP_Name:2d7c130d-84a47969-926907fa-f1649678" after the change of LDAP Config object name (+ execution of the LDAP Sync); obviously the names and the IDs will differ in your case.
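To double-check the regeneration, a quick query in the same spirit as the checks above can be used (sample user name, adjust as needed):

iapi ${docbase} -U${USER} -Pxxx << EOC
?,c,select user_name, user_login_domain, user_global_unique_id from dm_user where user_source='LDAP' and user_name='Patou Morgan';
EOC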

 

The article Documentum – Change/Rename of LDAP Config object first appeared on Blog dbi services.
