diff --git a/README.md b/README.md index b7ba485c8a1992adc466b3d6b4a9bcb305a88f6e..f93cf7060920d898b1b1ac667987697e66afb784 100644 --- a/README.md +++ b/README.md @@ -33,14 +33,14 @@ You are then ready to go. A full documentation will come soon. ### Fetch one time series by ID First, let's assume that we know which series we want to download. A series identifier (ID) is defined by three values, formatted like this: `provider_code/dataset_code/series_code`. -The `fetch_series` function is used to construct the cell array. +The `mdbnomics` function is used to construct the cell array. For example, to fetch the time series `EA19.1.0.0.0.ZUTN` from the ["Unemployment rate" [ZUTN] dataset](https://db.nomics.world/AMECO/ZUTN) belonging to the [AMECO provider](https://db.nomics.world/AMECO). Example: - >> df_id = fetch_series('series_ids', 'AMECO/ZUTN/EA19.1.0.0.0.ZUTN'); + >> df_id = mdbnomics('series_ids', 'AMECO/ZUTN/EA19.1.0.0.0.ZUTN'); The returned data is stored in the `df_id` variable. Its type is a cell array. To display the first 3 rows of the array as a table (including the column headers), type: @@ -77,14 +77,14 @@ Followed by dimensions columns, corresponding to the dimensions of the dataset: ### Fetch two time series by ID Again, let's assume that we know which series we want to download. -We can reuse the `fetch_series` function, this time with two series codes. +We can reuse the `mdbnomics` function, this time with two series codes. For example, to fetch the time series `EA19.1.0.0.0.ZUTN` and `DNK.1.0.0.0.ZUTN` from the ["Unemployment rate" [ZUTN] dataset](https://db.nomics.world/AMECO/ZUTN) belonging to the [AMECO provider](https://db.nomics.world/AMECO). 
Example: - >> df_ids = fetch_series('series_ids', {'AMECO/ZUTN/EA19.1.0.0.0.ZUTN', 'AMECO/ZUTN/DNK.1.0.0.0.ZUTN'}); + >> df_ids = mdbnomics('series_ids', {'AMECO/ZUTN/EA19.1.0.0.0.ZUTN', 'AMECO/ZUTN/DNK.1.0.0.0.ZUTN'}); ### Fetch time series by code mask The code mask notation is a very concise way to select one or many time series at once. @@ -106,9 +106,9 @@ Given 3 dimensions 'frequency', 'country' and 'indicator', the user can select: Examples: - >> df_code_mask1 = fetch_series('provider_code', 'IMF', 'dataset_code', 'CPI', 'series_code', 'M.FR+DE.PCPIEC_IX+PCPIA_IX'); - >> df_code_mask2 = fetch_series('provider_code', 'IMF', 'dataset_code', 'CPI', 'series_code', '.FR.PCPIEC_WT'); - >> df_code_mask3 = fetch_series('provider_code', 'IMF', 'dataset_code', 'CPI', 'series_code', 'M..PCPIEC_IX+PCPIA_IX', 'max_nb_series', 400); + >> df_code_mask1 = mdbnomics('provider_code', 'IMF', 'dataset_code', 'CPI', 'series_code', 'M.FR+DE.PCPIEC_IX+PCPIA_IX'); + >> df_code_mask2 = mdbnomics('provider_code', 'IMF', 'dataset_code', 'CPI', 'series_code', '.FR.PCPIEC_WT'); + >> df_code_mask3 = mdbnomics('provider_code', 'IMF', 'dataset_code', 'CPI', 'series_code', 'M..PCPIEC_IX+PCPIA_IX', 'max_nb_series', 400); ### Fetch time series by dimension Searching by dimension is a less concise way to select time series than using the code mask, but it's universal: @@ -118,34 +118,100 @@ and the indicator "Procedures required to start a business - Women (number)" (`i Example: - >> df_dims = fetch_series('provider_code', 'WB', 'dataset_code', 'DB', 'dimensions', '{"country":["ES","FR","IT"],"indicator":["IC.REG.COST.PC.FE.ZS.DRFN"]}'); + >> df_dims = mdbnomics('provider_code', 'WB', 'dataset_code', 'DB', 'dimensions', '{"country":["ES","FR","IT"],"indicator":["IC.REG.COST.PC.FE.ZS.DRFN"]}'); ### Fetch time series by API link When the dimensions, provider, dataset or series codes are unknown, the user can: * go to the page of a dataset on DBnomics website (eg: [Doing 
Business](https://db.nomics.world/WB/DB)) * select some dimensions by using the input widgets of the left column * click on `Copy API link` in the menu of the `Download` button -* use the `fetch_series_by_api_link` function +* use the `mdbnomics` function with the `api_link` parameter. Example: - >> df_link = fetch_series_by_api_link('https://api.db.nomics.world/v22/series/WB/DB/ENF.CONT.COEN.ATDR-AE?observations=1'); + >> df_link = mdbnomics('api_link', 'https://api.db.nomics.world/v22/series/WB/DB/ENF.CONT.COEN.ATDR-AE?observations=1'); ### Fetch time series from the cart -On the [cart page](https://db.nomics.world/cart) of the DBnomics website, click on "Copy API link" and copy-paste it as an argument of the fetch_series_by_api_link function. +On the [cart page](https://db.nomics.world/cart) of the DBnomics website, click on "Copy API link" and copy-paste it as an argument of the `mdbnomics` function. Please note that when you update your cart, you have to copy this link again, because the link itself contains the IDs of the series in the cart. Example: - >> df_cart = fetch_series_by_api_link('https://api.db.nomics.world/v22/series?series_ids=AMECO%2FZUTN%2FEA19.1.0.0.0.ZUTN&observations=1'); + >> df_cart = mdbnomics('api_link', 'https://api.db.nomics.world/v22/series?series_ids=AMECO%2FZUTN%2FEA19.1.0.0.0.ZUTN&observations=1'); ### Fetch time series with different frequencies Example: - >> df_multi_freq = fetch_series('series_ids', {'BEA/NIUnderlyingDetail-U001BC/S315-A',... - 'BEA/NIUnderlyingDetail-U001BC/S315-Q',... - 'BEA/NIUnderlyingDetail-U001BC/S315-M'}); + >> df_multi_freq = mdbnomics('series_ids', {'BEA/NIUnderlyingDetail-U001BC/S315-A',... + 'BEA/NIUnderlyingDetail-U001BC/S315-Q',... + 'BEA/NIUnderlyingDetail-U001BC/S315-M'}); + +### Fetch the available datasets of a provider +When fetching series from DBnomics, the user needs to give a provider and a dataset before specifying correct dimensions. 
+With the function `mdbnomics_datasets`, the user can download the list of the available datasets for a provider. +If no `provider_code` is supplied, the datasets of every provider are returned. + +Example: + + >> datasets = mdbnomics_datasets('provider_code', 'IMF'); + +The result is a structure with a cell array containing the dataset codes and names of the requested providers. +To fetch the available datasets for multiple providers with the same function, the user has to pass a cell array of provider codes. + +Example: + + >> datasets = mdbnomics_datasets('provider_code', {'IMF', 'BDF'}); + +If the user requests the datasets of a single provider and `simplify` is set to `true`, the result is a simple cell array, not a structure. + +Example: + + >> datasets = mdbnomics_datasets('provider_code', 'IMF', 'simplify', true); + +### Fetch the possible dimensions of available datasets of a provider +When fetching series from DBnomics, it is often useful to specify dimensions for a particular dataset so that only the series you want to analyse are downloaded. With the function `mdbnomics_dimensions`, +the user can download these dimensions and their meanings. + +Example: + + >> datasets = mdbnomics_dimensions('provider_code', 'IMF', 'dataset_code', 'WEO'); + +The result is a nested structure (its field names are `IMF_WEO` and the dimension names) with a structure at the end of each branch. + +If the user requests the dimensions of a single dataset of a single provider and `simplify` is set to `true`, the result is a simple structure, not a nested one. + +Example: + + >> datasets = mdbnomics_dimensions('provider_code', 'IMF', 'dataset_code', 'WEO', 'simplify', true); + +To download the dimensions of every dataset gathered by DBnomics, the user does not have to set any arguments. 
+ +Example: + + >> datasets = mdbnomics_dimensions(); + +### Fetch the series codes and names of available datasets of a provider +The user can download the list of series, and especially their codes, of a provider's dataset by using the function `mdbnomics_series`. The result is a structure with a cell array at the end of each branch. If `simplify` is set to `true`, +then the result will be a simple cell array. + +Example: + + >> series = mdbnomics_series('provider_code', 'IMF', 'dataset_code', 'WEO', 'simplify', true); + +As with the function `mdbnomics()`, additional parameters can be passed to `mdbnomics_series()`. The user can ask for the series with specific dimensions: + +Example: + + >> series = mdbnomics_series('provider_code', 'IMF', 'dataset_code', 'WEO', 'dimensions', '{"weo-subject":["NGDP_RPCH"]}', 'simplify', true); + +or with a query: + +Example: + + >> series = mdbnomics_series('provider_code', 'IMF', 'dataset_code', 'WEO', 'query', 'NGDP_RPCH'); + +> :warning: **Please use this function sparingly, because each dataset can contain a huge number of series. Fetch the series of a single dataset only when you need them, or visit the DBnomics website instead.** ### Transform time series The routines can interact with the [Time Series Editor](https://editor.nomics.world/) to transform time series by applying filters to them. 
@@ -157,7 +223,7 @@ Here is an example of how to interpolate two annual time series with a monthly f Example: >> filters_ = '[{"code": "interpolate", "parameters": {"frequency": "monthly", "method": "spline"}}]'; - >> df_filter = fetch_series('series_ids', 'AMECO/ZUTN/EA19.1.0.0.0.ZUTN', 'dbnomics_filters', filters_); + >> df_filter = mdbnomics('series_ids', 'AMECO/ZUTN/EA19.1.0.0.0.ZUTN', 'dbnomics_filters', filters_); The first row of the final cell array changes when filters are used: * `period_middle_day`: the middle day of `original_period` (can be useful when you compare graphically interpolated series and original ones) diff --git a/src/initialize_mdbnomics.m b/src/initialize_mdbnomics.m index a850b622c24ad357d7e1d6e3d13b83aed0bb64f1..da46695fffaa6e44863891e78b51a73221f22c54 100644 --- a/src/initialize_mdbnomics.m +++ b/src/initialize_mdbnomics.m @@ -18,6 +18,8 @@ function initialize_mdbnomics() % along with Dynare. If not, see <http://www.gnu.org/licenses/>. % Get the path to the mdbnomics toolbox. +global mdb_options + mdbnomics_src_root = strrep(which('initialize_mdbnomics'), 'initialize_mdbnomics.m', ''); % Set the subfolders to be added in the path. 
@@ -58,4 +60,10 @@ if matlab_ver_less_than('9.1') addpath([mdbnomics_src_root '/../contrib/jsonlab']); end +mdb_options.api_base_url = 'https://api.db.nomics.world'; +mdb_options.editor_base_url = 'https://editor.nomics.world'; +mdb_options.api_version = 22; +mdb_options.editor_version = 1; + +assignin('caller', 'mdb_options', mdb_options); assignin('caller', 'mdbnomics_src_root', mdbnomics_src_root); diff --git a/src/fetch_series.m b/src/mdbnomics.m similarity index 77% rename from src/fetch_series.m rename to src/mdbnomics.m index 83012d1787785c8e7563f7a7b4e4af6759186cd5..3ea08434f333215b940da4494897a7cad7d153db 100644 --- a/src/fetch_series.m +++ b/src/mdbnomics.m @@ -1,21 +1,25 @@ -function df = fetch_series(varargin) % --*-- Unitary tests --*-- -% function fetch_series(varargin) +function df = mdbnomics(varargin) % --*-- Unitary tests --*-- +% function mdbnomics(varargin) % Download time series from DBnomics. % Returns a cell array. % % Examples: % Fetch one series: -% fetch_series('provider_code', 'IMF', 'dataset_code', 'CPI', 'series_code', 'M.FR+DE.PCPIEC_IX+PCPIA_IX'); -% fetch_series('provider_code', 'IMF', 'dataset_code', 'CPI', 'series_code', '.FR.PCPIEC_WT'); +% mdbnomics('provider_code', 'IMF', 'dataset_code', 'CPI', 'series_code', 'M.FR+DE.PCPIEC_IX+PCPIA_IX'); +% mdbnomics('provider_code', 'IMF', 'dataset_code', 'CPI', 'series_code', '.FR.PCPIEC_WT'); % % Fetch all the series of a dataset: -% fetch_series('provider_code', 'AMECO', 'dataset_code', 'UVGD', 'max_nb_series', 500); +% mdbnomics('provider_code', 'AMECO', 'dataset_code', 'UVGD', 'max_nb_series', 500); % % Fetch many series from different datasets: -% fetch_series('series_ids', {'AMECO/ZUTN/EA19.1.0.0.0.ZUTN', 'AMECO/ZUTN/DNK.1.0.0.0.ZUTN', 'IMF/CPI/A.AT.PCPIT_IX'}); +% mdbnomics('series_ids', {'AMECO/ZUTN/EA19.1.0.0.0.ZUTN', 'AMECO/ZUTN/DNK.1.0.0.0.ZUTN', 'IMF/CPI/A.AT.PCPIT_IX'}); % % Fetch many series from the same dataset, searching by dimension: -% fetch_series('provider_code','AMECO', 
'dataset_code', 'ZUTN', 'dimensions', '{"geo":["dnk"]}'); +% mdbnomics('provider_code','AMECO', 'dataset_code', 'ZUTN', 'dimensions', '{"geo":["dnk"]}'); +% +% Fetch series given an "API link" URL. +% "API link" URLs can be found on DBnomics web site (https://db.nomics.world/) on dataset or series pages using "Download" buttons. +% mdbnomics('api_link', 'https://api.db.nomics.world/v22/series?series_ids=AMECO%2FZUTN%2FEA19.1.0.0.0.ZUTN&observations=1'); % % POSSIBLE PARAMETERS % provider_code [string] the code of the dataset provider. @@ -26,6 +30,7 @@ function df = fetch_series(varargin) % --*-- Unitary tests --*-- % max_nb_series [integer] maximum number of series requested by the API. If not provided, a default value of 50 series will be used. % api_base_url [string] the base URL used for API requests. If not provided, a default value of: 'https://api.db.nomics.world/v22/' will be used. % dbnomics_filters [char] filters to apply on the requested series. If provided it must be a string formatted like: '[{"code": "interpolate", "parameters": {"frequency": "monthly", "method": "spline"}}]'. +% api_link [char] fetch series given an "API link" URL. % % OUTPUTS % df @@ -52,8 +57,10 @@ function df = fetch_series(varargin) % --*-- Unitary tests --*-- % You should have received a copy of the GNU General Public License % along with Dynare. If not, see <http://www.gnu.org/licenses/>. 
-default_api_base_url = 'https://api.db.nomics.world/v22/'; -default_editor_base_url = 'https://editor.nomics.world/api/v1/'; +global mdb_options + +default_api_base_url = sprintf('%s/v%d/', mdb_options.api_base_url, mdb_options.api_version); +default_editor_base_url = sprintf('%s/api/v%d/', mdb_options.editor_base_url, mdb_options.editor_version); p = inputParser; validStringInput = @(x) ischar(x) || iscellstr(x); @@ -65,6 +72,7 @@ p.addParameter('series_ids', '',validStringInput); p.addParameter('max_nb_series', NaN, @isnumeric); p.addParameter('api_base_url', default_api_base_url, validStringInput); p.addParameter('dbnomics_filters', '', @ischar); +p.addParameter('api_link', '', @ischar); p.KeepUnmatched = false; p.parse(varargin{:}); @@ -85,7 +93,7 @@ end series_base_url = [p.Results.api_base_url 'series']; if isa(p.Results.dimensions, 'function_handle') && isempty(p.Results.series_code) && isempty(p.Results.series_ids) - if isempty(p.Results.provider_code) || isempty(p.Results.dataset_code) + if (isempty(p.Results.provider_code) || isempty(p.Results.dataset_code)) && isempty(p.Results.api_link) error('When you don''t use dimensions, you must specifiy provider_code and dataset_code.'); end api_link = sprintf('%s/%s/%s?observations=1', series_base_url, p.Results.provider_code, p.Results.dataset_code); @@ -116,12 +124,17 @@ if ~isempty(p.Results.series_ids) end api_link = sprintf('%s?observations=1&series_ids=%s', series_base_url, series_ids); end + +if ~isempty(p.Results.api_link) + api_link = p.Results.api_link; +end + df = fetch_series_by_api_link(api_link, p.Results.dbnomics_filters, p.Results.max_nb_series, default_editor_base_url); end %@test:1 % test_fetch_series_by_code %$ try -%$ df = fetch_series('provider_code', 'AMECO', 'dataset_code', 'ZUTN', 'series_code', 'EA19.1.0.0.0.ZUTN'); +%$ df = mdbnomics('provider_code', 'AMECO', 'dataset_code', 'ZUTN', 'series_code', 'EA19.1.0.0.0.ZUTN'); %$ t(1) = 1; %$ catch %$ t = 0; @@ -141,7 +154,7 @@ end %@test:2 % 
test_fetch_series_by_code_mask %$ try -%$ df = fetch_series('provider_code', 'IMF', 'dataset_code', 'CPI', 'series_code', 'M.FR+DE.PCPIEC_IX+PCPIA_IX'); +%$ df = mdbnomics('provider_code', 'IMF', 'dataset_code', 'CPI', 'series_code', 'M.FR+DE.PCPIEC_IX+PCPIA_IX'); %$ t(1) = 1; %$ catch %$ t = 0; @@ -160,7 +173,7 @@ end %@test:3 % test_fetch_series_by_code_mask_with_plus %$ try -%$ df = fetch_series('provider_code', 'SCB', 'dataset_code', 'AKIAM', 'series_code', '"J+K"+"G+H".AM0301C1'); +%$ df = mdbnomics('provider_code', 'SCB', 'dataset_code', 'AKIAM', 'series_code', '"J+K"+"G+H".AM0301C1'); %$ t(1) = 1; %$ catch %$ t = 0; @@ -179,7 +192,7 @@ end %@test:4 % test_fetch_series_by_dimension %$ try -%$ df = fetch_series('provider_code','WB','dataset_code','DB', 'dimensions', '{"country":["ES","FR","IT"],"indicator":["IC.REG.COST.PC.FE.ZS.DRFN"]}'); +%$ df = mdbnomics('provider_code','WB','dataset_code','DB', 'dimensions', '{"country":["ES","FR","IT"],"indicator":["IC.REG.COST.PC.FE.ZS.DRFN"]}'); %$ t(1) = 1; %$ catch %$ t = 0; @@ -198,7 +211,7 @@ end %@test:5 % test_fetch_series_by_id %$ try -%$ df = fetch_series('series_ids','AMECO/ZUTN/EA19.1.0.0.0.ZUTN'); +%$ df = mdbnomics('series_ids','AMECO/ZUTN/EA19.1.0.0.0.ZUTN'); %$ t(1) = 1; %$ catch %$ t = 0; @@ -218,7 +231,7 @@ end %@test:6 % test_fetch_series_by_ids_in_different_datasets %$ try -%$ df = fetch_series('series_ids', {'AMECO/ZUTN/EA19.1.0.0.0.ZUTN', 'BIS/cbs/Q.S.5A.4B.F.B.A.A.LC1.A.1C'}); +%$ df = mdbnomics('series_ids', {'AMECO/ZUTN/EA19.1.0.0.0.ZUTN', 'BIS/cbs/Q.S.5A.4B.F.B.A.A.LC1.A.1C'}); %$ t(1) = 1; %$ catch %$ t = 0; @@ -244,7 +257,7 @@ end %@test:7 % test_fetch_series_by_ids_in_same_dataset %$ try -%$ df = fetch_series('series_ids', {'AMECO/ZUTN/EA19.1.0.0.0.ZUTN',... +%$ df = mdbnomics('series_ids', {'AMECO/ZUTN/EA19.1.0.0.0.ZUTN',... 
%$ 'AMECO/ZUTN/DNK.1.0.0.0.ZUTN'}); %$ t(1) = 1; %$ catch @@ -267,7 +280,7 @@ end %@test:8 % test_fetch_series_of_dataset %$ try -%$ df = fetch_series('provider_code', 'AMECO', 'dataset_code', 'ZUTN'); +%$ df = mdbnomics('provider_code', 'AMECO', 'dataset_code', 'ZUTN'); %$ t(1) = 1; %$ catch %$ t = 0; @@ -287,7 +300,7 @@ end %@test:9 % test_fetch_series_with_filter_on_one_series %$ try %$ filters_ = '[{"code": "interpolate", "parameters": {"frequency": "monthly", "method": "spline"}}]'; -%$ df = fetch_series('provider_code', 'AMECO', 'dataset_code', 'ZUTN', 'series_code', 'DEU.1.0.0.0.ZUTN', 'dbnomics_filters', filters_); +%$ df = mdbnomics('provider_code', 'AMECO', 'dataset_code', 'ZUTN', 'series_code', 'DEU.1.0.0.0.ZUTN', 'dbnomics_filters', filters_); %$ t(1) = 1; %$ catch %$ t = 0; @@ -312,7 +325,7 @@ end %@test:10 % test_fetch_series_with_max_nb_series %$ try -%$ df = fetch_series('provider_code', 'AMECO', 'dataset_code', 'ZUTN', 'max_nb_series',20); +%$ df = mdbnomics('provider_code', 'AMECO', 'dataset_code', 'ZUTN', 'max_nb_series',20); %$ t(1) = 1; %$ catch %$ t = 0; @@ -332,7 +345,7 @@ end %@test:11 % test_fetch_series_with_na_values %$ try -%$ df = fetch_series('provider_code', 'AMECO', 'dataset_code', 'ZUTN', 'series_code', 'DEU.1.0.0.0.ZUTN'); +%$ df = mdbnomics('provider_code', 'AMECO', 'dataset_code', 'ZUTN', 'series_code', 'DEU.1.0.0.0.ZUTN'); %$ t(1) = 1; %$ catch %$ t = 0; @@ -351,3 +364,22 @@ end %$ %$ T = all(t); %@eof:11 + +%@test:12 % test_fetch_series_by_api_link +%$ try +%$ df = mdbnomics('api_link', 'https://api.db.nomics.world/v22/series/BIS/long_pp?limit=1000&offset=0&q=&observations=1&align_periods=1&dimensions=%7B%7D'); +%$ t(1) = 1; +%$ catch +%$ t = 0; +%$ end +%$ +%$ if t(1) +%$ t(2) = dassert(length(unique(df(2:end,2))),1); +%$ t(3) = dassert(df(2,2), {'BIS'}); +%$ t(4) = dassert(length(unique(df(2:end,3))),1); +%$ t(5) = dassert(df(2,3), {'long_pp'}); +%$ t(6) = dassert(length(unique(df(2:end,5))),23); +%$ end +%$ +%$ T = all(t); 
+%@eof:12 diff --git a/src/mdbnomics_datasets.m b/src/mdbnomics_datasets.m new file mode 100644 index 0000000000000000000000000000000000000000..73f0ffeb95e28e84e358a17c3274bcfec9c26133 --- /dev/null +++ b/src/mdbnomics_datasets.m @@ -0,0 +1,136 @@ +function datasets = mdbnomics_datasets(varargin) % --*-- Unitary tests --*-- +% function mdbnomics_datasets(varargin) +% Downloads the list of available datasets for a selection of providers (or all of them) from https://db.nomics.world/. +% By default, the function returns a structure with a cell array containing the dataset codes and names of the requested providers. +% +% POSSIBLE PARAMETERS +% provider_code [char] DBnomics code of one or multiple providers. If empty, the providers are first +% downloaded with the function mdbnomics_providers and then the available datasets are requested. +% simplify [logical] If true, when the datasets are requested for only one provider then a cell array is returned, not a structure. +% If not provided, the default value is false. +% +% OUTPUTS +% datasets +% +% SPECIAL REQUIREMENTS +% none + +% Copyright (C) 2020 Dynare Team +% +% This file is part of Dynare. +% +% Dynare is free software: you can redistribute it and/or modify +% it under the terms of the GNU General Public License as published by +% the Free Software Foundation, either version 3 of the License, or +% (at your option) any later version. +% +% Dynare is distributed in the hope that it will be useful, +% but WITHOUT ANY WARRANTY; without even the implied warranty of +% MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +% GNU General Public License for more details. +% +% You should have received a copy of the GNU General Public License +% along with Dynare. If not, see <http://www.gnu.org/licenses/>. 
+ +global mdb_options + +p = inputParser; +validStringInput = @(x) ischar(x) || iscellstr(x); +p.addParameter('provider_code', '', validStringInput); +p.addParameter('simplify', false, @islogical); +p.KeepUnmatched = false; +p.parse(varargin{:}); + +if isempty(p.Results.provider_code) + provider_code = mdbnomics_providers('code', true); +else + if ischar(p.Results.provider_code) + provider_code = {p.Results.provider_code}; + else + provider_code = p.Results.provider_code; + end +end + +datasets = struct(); +for i = 1:numel(provider_code) + pc = provider_code{i}; + provider_page = sprintf('%s/v%d/providers/%s', mdb_options.api_base_url, mdb_options.api_version, pc); + provider_info = webread(provider_page); + provider_info = provider_info.category_tree; + code = []; + name = []; + if isfield(provider_info, 'children') + unpack_children(provider_info, code, name); + else + try + for n = 1:numel(provider_info) + code = [code, {provider_info{n}.code}]; + name = [name, {provider_info{n}.name}]; + end + catch + for n = 1:numel(provider_info) + code = [code, {provider_info(n).code}]; + name = [name, {provider_info(n).name}]; + end + end + end + datasets.(pc) = horzcat(code', name'); +end + +if p.Results.simplify + if length(fieldnames(datasets)) == 1 + datasets = datasets.(pc); + else + error('Your query corresponds to multiple providers, not possible to simplify'); + end +end +end + +%@test:1 +%$ try +%$ datasets = mdbnomics_datasets('provider_code', 'IMF', 'simplify', true); +%$ t(1) = 1; +%$ catch +%$ t = 0; +%$ end +%$ +%$ if t(1) +%$ t(2) = dassert(length(unique(datasets(:,1))), 43); +%$ t(3) = dassert(size(datasets, 2), 2); +%$ end +%$ +%$ T = all(t); +%@eof:1 + +%@test:2 +%$ try +%$ datasets = mdbnomics_datasets('provider_code', 'IMF'); +%$ t(1) = 1; +%$ catch +%$ t = 0; +%$ end +%$ +%$ if t(1) +%$ t(2) = dassert(fieldnames(datasets), {'IMF'}); +%$ t(3) = dassert(size(datasets.IMF,1), 43); +%$ end +%$ +%$ T = all(t); +%@eof:2 + +%@test:3 +%$ try +%$ datasets = 
mdbnomics_datasets('provider_code', {'IMF', 'AMECO'}); +%$ t(1) = 1; +%$ catch +%$ t = 0; +%$ end +%$ +%$ if t(1) +%$ t(2) = dassert(fieldnames(datasets), {'IMF'; 'AMECO'}); +%$ t(3) = dassert(size(datasets.IMF,1), 43); +%$ t(4) = dassert(size(datasets.AMECO,1), 473); +%$ end +%$ +%$ T = all(t); +%@eof:3 \ No newline at end of file diff --git a/src/mdbnomics_dimensions.m b/src/mdbnomics_dimensions.m new file mode 100644 index 0000000000000000000000000000000000000000..89b7a3a85c3badcb53700b481af5d0fa2fd61687 --- /dev/null +++ b/src/mdbnomics_dimensions.m @@ -0,0 +1,152 @@ +function dimensions = mdbnomics_dimensions(varargin) % --*-- Unitary tests --*-- +% function mdbnomics_dimensions(varargin) +% Downloads the list of dimensions (if they exist) for available datasets of a selection of providers from https://db.nomics.world/. +% By default, the function returns a structure containing the dimensions of datasets for DBnomics providers. +% +% POSSIBLE PARAMETERS +% provider_code [char] DBnomics code of one or multiple providers. If empty, the providers are first +% downloaded with the function mdbnomics_providers and then the available datasets are requested. +% dataset_code [char] DBnomics code of one or multiple datasets of a provider. If empty, the dataset codes are downloaded +% with the function mdbnomics_datasets and then the dimensions are requested. +% simplify [logical] If true, when the dimensions are requested for only one provider and one dataset then only the dimension names and their values are provided. +% If not provided, the default value is false. +% +% OUTPUTS +% dimensions +% +% SPECIAL REQUIREMENTS +% none + +% Copyright (C) 2020 Dynare Team +% +% This file is part of Dynare. +% +% Dynare is free software: you can redistribute it and/or modify +% it under the terms of the GNU General Public License as published by +% the Free Software Foundation, either version 3 of the License, or +% (at your option) any later version. 
+% +% Dynare is distributed in the hope that it will be useful, +% but WITHOUT ANY WARRANTY; without even the implied warranty of +% MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +% GNU General Public License for more details. +% +% You should have received a copy of the GNU General Public License +% along with Dynare. If not, see <http://www.gnu.org/licenses/>. + +global mdb_options + +p = inputParser; +validStringInput = @(x) ischar(x) || iscellstr(x); +p.addParameter('provider_code', '', validStringInput); +p.addParameter('dataset_code', '', validStringInput); +p.addParameter('simplify', false, @islogical); +p.KeepUnmatched = false; +p.parse(varargin{:}); + +if isempty(p.Results.provider_code) && ~isempty(p.Results.dataset_code) + error('When you use dataset_code, you must specify provider_code as well.'); +end + +if iscell(p.Results.provider_code) || iscell(p.Results.dataset_code) + if ~isempty(p.Results.provider_code) && ~isempty(p.Results.dataset_code) && length(p.Results.provider_code) ~= length(p.Results.dataset_code) + error('Please specify as many provider codes as dataset codes.') + end +end + +if isempty(p.Results.provider_code) + provider_code = mdbnomics_providers('code', true); +else + if ischar(p.Results.provider_code) + provider_code = {p.Results.provider_code}; + else + provider_code = p.Results.provider_code; + end +end + +if isempty(p.Results.dataset_code) + dataset_code = mdbnomics_datasets('provider_code', provider_code); +else + if ischar(p.Results.dataset_code) + dataset_code = {p.Results.dataset_code}; + else + dataset_code = p.Results.dataset_code; + end +end + +dimensions = struct(); +for i = 1:numel(provider_code) + pc = provider_code{i}; + dc = dataset_code{i}; + dataset_page = sprintf('%s/v%d/datasets/%s/%s', mdb_options.api_base_url, mdb_options.api_version, pc, dc); + dataset_info = webread(dataset_page); + dataset_name = sprintf('%s_%s', pc, dc); + + try + tmp1 = dataset_info.datasets.docs.dimensions_labels; + catch + 
try + tmp1 = dataset_info.datasets.(dataset_name).dimensions_labels; + catch + tmp1 = struct(); % fieldnames() below needs a structure, not a cell + end + end + + try + tmp2 = dataset_info.datasets.docs.dimensions_values_labels; + catch + try + tmp2 = dataset_info.datasets.(dataset_name).dimensions_values_labels; + catch + tmp2 = struct(); + end + end + + dataset_dimensions = fieldnames(tmp1); + for d = 1:numel(dataset_dimensions) + dimensions.(dataset_name).(dataset_dimensions{d}) = tmp2.(dataset_dimensions{d}); + end +end + +if p.Results.simplify + if length(fieldnames(dimensions)) == 1 + dimensions = dimensions.(dataset_name); + else + error('Your query corresponds to multiple datasets, not possible to simplify'); + end +end +end + +%@test:1 +%$ try +%$ dimensions = mdbnomics_dimensions('provider_code', 'IMF', 'dataset_code', 'WEO'); +%$ t(1) = 1; +%$ catch +%$ t = 0; +%$ end +%$ +%$ if t(1) +%$ t(2) = dassert(fieldnames(dimensions), {'IMF_WEO'}); +%$ t(3) = dassert(isfield(dimensions.IMF_WEO, 'unit'), true); +%$ t(4) = dassert(length(fieldnames(dimensions.IMF_WEO.unit)), 13); +%$ end +%$ +%$ T = all(t); +%@eof:1 + +%@test:2 +%$ try +%$ dimensions = mdbnomics_dimensions('provider_code', 'IMF', 'dataset_code', 'WEO', 'simplify', true); +%$ t(1) = 1; +%$ catch +%$ t = 0; +%$ end +%$ +%$ if t(1) +%$ t(2) = dassert(isfield(dimensions, 'IMF_WEO'), false); +%$ t(3) = dassert(isfield(dimensions, 'unit'), true); +%$ t(4) = dassert(length(fieldnames(dimensions.unit)), 13); +%$ end +%$ +%$ T = all(t); +%@eof:2 diff --git a/src/mdbnomics_series.m b/src/mdbnomics_series.m new file mode 100644 index 0000000000000000000000000000000000000000..3b5c0107fc66cea6a9d7547dfea5380d3bba1958 --- /dev/null +++ b/src/mdbnomics_series.m @@ -0,0 +1,237 @@ +function series = mdbnomics_series(varargin) % --*-- Unitary tests --*-- +% function mdbnomics_series(varargin) +% Downloads the list of series for available datasets of a selection of providers from https://db.nomics.world/. 
+% We warn the user that this function can take a (very) long time to execute! 
+% As a reminder, DBnomics requests data from 63 providers to retrieve 21675 datasets, for a total of approximately 720 million series. +% By default, the function returns a structure with a cell array at the end of each branch containing the series codes and names of datasets for DBnomics providers. +% +% POSSIBLE PARAMETERS +% provider_code [char] DBnomics code of one or multiple providers. If empty, the providers are first +% downloaded with the function mdbnomics_providers and then the available datasets are requested. +% dataset_code [char] DBnomics code of one or multiple datasets of a provider. If empty, the dataset codes are downloaded +% with the function mdbnomics_datasets and then the series are requested. +% dimensions [char] DBnomics code of one or several dimensions in the specified provider and dataset. +% If provided it must be a string formatted like: '{"country":["ES","FR","IT"],"indicator":["IC.REG.COST.PC.FE.ZS.DRFN"]}'. +% query [char] A query to filter/select series from a provider's dataset. +% only_number_of_series [logical] If true, only the number of series for the given query will be printed in the command window. +% If not provided, the default value is false. +% simplify [logical] If true, when the series are requested for only one dataset then a cell array is returned, not a structure. +% If not provided, the default value is false. +% +% OUTPUTS +% series +% +% SPECIAL REQUIREMENTS +% none + +% Copyright (C) 2020 Dynare Team +% +% This file is part of Dynare. +% +% Dynare is free software: you can redistribute it and/or modify +% it under the terms of the GNU General Public License as published by +% the Free Software Foundation, either version 3 of the License, or +% (at your option) any later version. +% +% Dynare is distributed in the hope that it will be useful, +% but WITHOUT ANY WARRANTY; without even the implied warranty of +% MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. 
See the +% GNU General Public License for more details. +% +% You should have received a copy of the GNU General Public License +% along with Dynare. If not, see <http://www.gnu.org/licenses/>. + +global mdb_options + +p = inputParser; +validStringInput = @(x) ischar(x) || iscellstr(x); +p.addParameter('provider_code', '', validStringInput); +p.addParameter('dataset_code', '', validStringInput); +p.addParameter('dimensions', '', validStringInput); +p.addParameter('query', '', validStringInput); +p.addParameter('only_number_of_series', false, @islogical); +p.addParameter('simplify', false, @islogical); +p.KeepUnmatched = false; +p.parse(varargin{:}); + +if iscell(p.Results.provider_code) || iscell(p.Results.dataset_code) + if ~isempty(p.Results.provider_code) && ~isempty(p.Results.dataset_code) && length(p.Results.provider_code) ~= length(p.Results.dataset_code) + error('Please specify as many provider codes as dataset codes.') + end +end + +if isempty(p.Results.provider_code) + provider_code = mdbnomics_providers('code', true); +else + if ischar(p.Results.provider_code) + provider_code = {p.Results.provider_code}; + else + provider_code = p.Results.provider_code; + end +end + +if isempty(p.Results.dataset_code) + dataset_code = mdbnomics_datasets('provider_code', provider_code); +else + if ischar(p.Results.dataset_code) + dataset_code = {p.Results.dataset_code}; + else + dataset_code = p.Results.dataset_code; + end +end + +if ~isempty(p.Results.query) + if ischar(p.Results.query) + db_query = {p.Results.query}; + else + db_query = p.Results.query; + end +end + +if ~isempty(p.Results.dimensions) + if ischar(p.Results.dimensions) + dimensions = {p.Results.dimensions}; + else + dimensions = p.Results.dimensions; + end +end + +series = struct(); +for i = 1:numel(provider_code) + pc = provider_code{i}; + dc = dataset_code{i}; + dataset_page = sprintf('%s/v%d/series/%s/%s', mdb_options.api_base_url, mdb_options.api_version, pc, dc); + if exist('db_query', 'var') + 
dataset_page = sprintf('%s?q=%s', dataset_page, db_query{i}); + end + + if exist('dimensions', 'var') + if contains(dataset_page, '?') + spec = '&'; + else + spec = '?'; + end + dataset_page = sprintf('%s%sdimensions=%s', dataset_page, spec, dimensions{i}); + end + + dataset_info = webread(dataset_page); + dataset_name = sprintf('%s_%s', pc, dc); + limit = dataset_info.series.limit; + num_found = dataset_info.series.num_found; + + if p.Results.only_number_of_series + sprintf('Number of series = %d', num_found) + return + else + sprintf('The dataset %s from provider %s contains %d series.', dc, pc, num_found) + series_code = []; + series_name = []; + + if num_found > limit + sequence = 0:1:floor(num_found/limit); + + if contains(dataset_page, 'offset=') + dataset_page = regexprep(dataset_page, '&offset=[0-9]+', ''); + dataset_page = regexprep(dataset_page, '\?offset=[0-9]+', ''); + end + + if contains(dataset_page, '?') + sep = '&'; + else + sep = '?'; + end + + for j = 1:numel(sequence) + tmp_api_link = sprintf('%s%soffset=%d', dataset_page, sep, sequence(j)*limit); + dataset_info = webread(tmp_api_link); + series_info = dataset_info.series.docs; + for s = 1:numel(series_info) + series_code = [series_code, {series_info(s).series_code}]; + series_name = [series_name, {series_info(s).series_name}]; + end + end + series.(dataset_name) = horzcat(series_code', series_name'); + else + series_info = dataset_info.series.docs; + for s = 1:numel(series_info) + series_code = [series_code, {series_info(s).series_code}]; + series_name = [series_name, {series_info(s).series_name}]; + end + series.(dataset_name) = horzcat(series_code', series_name'); + end + end +end + +if p.Results.simplify + if length(fieldnames(series)) == 1 + series = series.(dataset_name); + else + error('Your query corresponds to multiple datasets, not possible to simplify'); + end +end +end + +%@test:1 +%$ try +%$ series = mdbnomics_series('provider_code', 'IMF', 'dataset_code', 'WEO', 'simplify',
true); +%$ t(1) = 1; +%$ catch +%$ t = 0; +%$ end +%$ +%$ if t(1) +%$ t(2) = dassert(length(unique(series(:,1))), 8924); +%$ t(3) = dassert(size(series, 2), 2); +%$ end +%$ +%$ T = all(t); +%@eof:1 + +%@test:2 +%$ try +%$ series = mdbnomics_series('provider_code', 'IMF', 'dataset_code', 'WEO'); +%$ t(1) = 1; +%$ catch +%$ t = 0; +%$ end +%$ +%$ if t(1) +%$ t(2) = dassert(fieldnames(series), {'IMF_WEO'}); +%$ t(3) = dassert(length(unique(series.IMF_WEO(:,1))), 8924); +%$ t(4) = dassert(size(series.IMF_WEO, 2), 2); +%$ end +%$ +%$ T = all(t); +%@eof:2 + +%@test:3 +%$ try +%$ series = mdbnomics_series('provider_code', 'IMF', 'dataset_code', 'WEO', 'dimensions', '{"weo-subject":["NGDP_RPCH"]}', 'simplify', true); +%$ t(1) = 1; +%$ catch +%$ t = 0; +%$ end +%$ +%$ if t(1) +%$ t(2) = dassert(length(unique(series(:,1))), 194); +%$ t(3) = dassert(size(series, 2), 2); +%$ end +%$ +%$ T = all(t); +%@eof:3 + +%@test:4 +%$ try +%$ series = mdbnomics_series('provider_code', 'IMF', 'dataset_code', 'WEO', 'query', 'NGDP_RPCH'); +%$ t(1) = 1; +%$ catch +%$ t = 0; +%$ end +%$ +%$ if t(1) +%$ t(2) = dassert(length(unique(series.IMF_WEO(:,1))), 194); +%$ t(3) = dassert(size(series.IMF_WEO, 2), 2); +%$ end +%$ +%$ T = all(t); +%@eof:4 \ No newline at end of file diff --git a/src/fetch_series_by_api_link.m b/src/utils/fetch_series_by_api_link.m similarity index 98% rename from src/fetch_series_by_api_link.m rename to src/utils/fetch_series_by_api_link.m index d87f3f53c18effac35c0b09eec1b7624e53aec22..0e71a41bbc87146d047b36ee60df46f15e59e0ec 100644 --- a/src/fetch_series_by_api_link.m +++ b/src/utils/fetch_series_by_api_link.m @@ -37,7 +37,10 @@ function df = fetch_series_by_api_link(api_link, varargin) % --*-- Unitary tests % % You should have received a copy of the GNU General Public License % along with Dynare. If not, see <http://www.gnu.org/licenses/>. 
-default_editor_base_url = 'https://editor.nomics.world/api/v1/'; + +global mdb_options + +default_editor_base_url = sprintf('%s/api/v%d/', mdb_options.editor_base_url, mdb_options.editor_version); p = inputParser; validStringInput = @(x) ischar(x) || iscellstr(x); diff --git a/src/utils/mdbnomics_providers.m b/src/utils/mdbnomics_providers.m new file mode 100644 index 0000000000000000000000000000000000000000..cdd693b77308b97dbdfd6afbe69f7d5d783d6041 --- /dev/null +++ b/src/utils/mdbnomics_providers.m @@ -0,0 +1,105 @@ +function providers = mdbnomics_providers(varargin) % --*-- Unitary tests --*-- +% function mdbnomics_providers(varargin) +% Downloads the list of DBnomics providers from https://db.nomics.world/. +% By default, the function returns a cell array containing the list of providers +% with additional information such as the region, the website, etc. +% +% POSSIBLE PARAMETERS +% code [logical] If true, then only the provider codes are returned in a cell array. If not provided, the default value is false. +% +% OUTPUTS +% providers +% +% SPECIAL REQUIREMENTS +% none + +% Copyright (C) 2020 Dynare Team +% +% This file is part of Dynare. +% +% Dynare is free software: you can redistribute it and/or modify +% it under the terms of the GNU General Public License as published by +% the Free Software Foundation, either version 3 of the License, or +% (at your option) any later version. +% +% Dynare is distributed in the hope that it will be useful, +% but WITHOUT ANY WARRANTY; without even the implied warranty of +% MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +% GNU General Public License for more details. +% +% You should have received a copy of the GNU General Public License +% along with Dynare. If not, see <http://www.gnu.org/licenses/>.
+ +global mdb_options + +p = inputParser; +p.addParameter('code', false, @islogical); +p.KeepUnmatched = false; +p.parse(varargin{:}); + +providers_url = sprintf('%s/v%d/providers', mdb_options.api_base_url, mdb_options.api_version); +response = webread(providers_url); + +if p.Results.code + providers = cell(size(response.providers.docs, 1),1); + for i = 1:size(response.providers.docs, 1) + providers{i} = response.providers.docs{i}.code; + end + providers(cellfun('isempty',providers)) = []; +else + providers = response.providers.docs; +end +end + +%@test:1 +%$ try +%$ providers = mdbnomics_providers('code', true); +%$ t(1) = 1; +%$ catch +%$ t = 0; +%$ end +%$ +%$ if t(1) +%$ t(2) = dassert(length(unique(providers)), 67); +%$ t(3) = dassert(providers{1}, 'AFDB'); +%$ t(4) = dassert(providers{67}, 'WTO'); +%$ end +%$ +%$ T = all(t); +%@eof:1 + +%@test:2 +%$ try +%$ providers = mdbnomics_providers('code', false); +%$ t(1) = 1; +%$ catch +%$ t = 0; +%$ end +%$ +%$ if t(1) +%$ t(2) = dassert(isstruct(providers{1}), true); +%$ t(3) = dassert(size(providers,1), 67); +%$ t(4) = dassert(providers{1}.code, 'AFDB'); +%$ t(5) = dassert(providers{67}.code, 'WTO'); +%$ end +%$ +%$ T = all(t); +%@eof:2 + +%@test:3 +%$ try +%$ providers = mdbnomics_providers(); +%$ t(1) = 1; +%$ catch +%$ t = 0; +%$ end +%$ +%$ if t(1) +%$ t(2) = dassert(isstruct(providers{1}), true); +%$ t(3) = dassert(size(providers,1), 67); +%$ t(4) = dassert(providers{1}.code, 'AFDB'); +%$ t(5) = dassert(providers{67}.code, 'WTO'); +%$ end +%$ +%$ T = all(t); +%@eof:3 diff --git a/src/utils/unpack_children.m b/src/utils/unpack_children.m new file mode 100644 index 0000000000000000000000000000000000000000..fbf35db524ca2890b45ae16d64589ddbd76cf722 --- /dev/null +++ b/src/utils/unpack_children.m @@ -0,0 +1,51 @@ +function unpack_children(structure, code, name) +% function unpack_children(structure, code, name) +% Recursively unpacks a nested structure with field name 'children'.
+% Returns arrays of dataset codes and names to the caller workspace. +% +% INPUTS +% structure [struct] structure to be unpacked +% code [array] array of dataset codes +% name [array] array of dataset names +% +% OUTPUTS +% code, name in caller workspace +% +% SPECIAL REQUIREMENTS +% none + +% Copyright (C) 2020 Dynare Team +% +% This file is part of Dynare. +% +% Dynare is free software: you can redistribute it and/or modify +% it under the terms of the GNU General Public License as published by +% the Free Software Foundation, either version 3 of the License, or +% (at your option) any later version. +% +% Dynare is distributed in the hope that it will be useful, +% but WITHOUT ANY WARRANTY; without even the implied warranty of +% MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +% GNU General Public License for more details. +% +% You should have received a copy of the GNU General Public License +% along with Dynare. If not, see <http://www.gnu.org/licenses/>. + +for i = 1:numel(structure) + if isfield(structure(i), 'children') + child = structure(i).children; + if isstruct(child) + unpack_children(child, code, name) + continue; + end + else + code = [code, {structure(i).code}]; + name = [name, {structure(i).name}]; + end +end + +assignin('caller', 'code', code); +assignin('caller', 'name', name); +end + +