==========================
Historical Price Service
==========================

Welcome to this walkthrough on the Tyba Historical API. By the end of this guide,
you'll know how to retrieve historical energy and ancillary prices. Also, the various
code objects mentioned here all contain hyperlinks to the relevant parts of the
:doc:`Client API Reference `, so make sure to click on them for more information!

Setting Up
----------

First, ensure you've :doc:`installed the Tyba Client `. After
:doc:`instantiating a Tyba client instance`, one can access the historical price
client by calling :attr:`client.services ` and its sub-interfaces
:attr:`~tyba_client.client.Services.lmp` and
:attr:`~tyba_client.client.Services.ancillary`

.. code-block:: python

    import os

    import pandas as pd
    from tyba_client.client import Client

    PAT = os.environ["TYBA_PAT"]
    client = Client(PAT)
    services = client.services
    lmp = services.lmp
    ancillary = services.ancillary

Fetching Available ISOs
-----------------------

The main use of the :class:`services ` interface is the
:meth:`~tyba_client.client.Services.get_all_isos` method. This method lists all of
the regional transmission operators and independent system operators (all generally
referred to as ISOs) that are represented in Tyba's historical price dataset. Even if
we are confident that the ISO we care about is in the dataset, we usually want to
start here to make sure that we have the right string identifier for our ISO.

.. code-block:: python

    isos = services.get_all_isos().json()
    print(isos)
    # returns ['PJMISO', 'CAISO', 'NEISO', 'NYISO', 'ERCOT', 'SPPISO', 'MISO']

Note that :meth:`~tyba_client.client.Services.get_all_isos` actually returns a
:class:`requests.Response` object, so we have to extract the useful JSON data from
the body of the response.
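Because every method returns a :class:`requests.Response`, the same check-then-extract steps recur throughout this guide. As a minimal sketch, they could be wrapped in a small helper; note that ``fetch_json`` is a hypothetical convenience function for illustration, not part of the Tyba client:

.. code-block:: python

    def fetch_json(response):
        """Raise if the request failed, then return the decoded JSON body.

        Hypothetical wrapper around the check-then-extract pattern;
        not part of the Tyba client.
        """
        response.raise_for_status()  # raises requests.HTTPError on 4xx/5xx status codes
        return response.json()       # decode the JSON body

    # usage with any client method, e.g.:
    # isos = fetch_json(services.get_all_isos())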
**This is the pattern used for all of the methods in the historical price API.**

Fetching Historical Energy Prices For a Single Node
---------------------------------------------------

Let's suppose we're interested in HB_HOUSTON's day-ahead energy prices from 2020
through 2021. First, we use the :meth:`lmp.search_nodes ` method to find the relevant
node ID

.. code-block:: python

    res = lmp.search_nodes(node_name_filter="HB_HOUSTON")
    res.raise_for_status()  # this will raise an error if our search request was unsuccessful
    nodes_list = res.json()["nodes"]
    print(nodes_list)
    # returns [{'node/id': '10000697077', 'node/name': 'HB_HOUSTON', 'node/iso': 'ERCOT',
    #           'node/lat': 29.852151, 'node/lng': -95.477535}]

In this case, our node search returned data for a single node since we applied such a
specific name filter. We can then extract the node ID and use it to get a
`pandas `__ Series of day-ahead prices with the :meth:`lmp.get_prices ` method

.. code-block:: python

    from tyba_client.client import Market

    node_id = nodes_list[0]["node/id"]
    res = lmp.get_prices(
        node_ids=[node_id],  # note this has to be a list, even if there is just a single ID
        market=Market.DA,
        start_year=2020,
        end_year=2021,
    )
    res.raise_for_status()
    price_data = res.json()[node_id]  # results are also organized by node ID
    price_data = pd.Series(
        index=pd.to_datetime(price_data["datetimes"]), data=price_data["prices"]
    )
    print(price_data.head())
    # returns
    # 2020-01-01 00:00:00    11.27
    # 2020-01-01 01:00:00     9.73
    # 2020-01-01 02:00:00     9.58
    # 2020-01-01 03:00:00     9.65
    # 2020-01-01 04:00:00    10.18

Working with Node Data
----------------------

The :meth:`~tyba_client.client.LMP.search_nodes` method provides the ability to search
based on location, node name, and ISO. For example, we can look at all the nodes
around a particular location:
.. code-block:: python

    res = lmp.search_nodes(location="29.760427, -95.369804")
    res.raise_for_status()  # this will raise an error if our search request was unsuccessful
    node_search_df = pd.DataFrame(res.json()["nodes"])
    print(node_search_df.head())
    # returns
    #      node/lng node/iso    node/name   node/lat      node/id  node/distance-meters
    # 0  -95.379870    ERCOT        CR_RN  29.776851  10018267019           2061.858094
    # 1  -95.398456    ERCOT  BRHEIGHT_RN  29.782491  10016437277           3692.691396
    # 2  -95.399247    ERCOT   TC_GTG2_RN  29.703634  10017407316           6909.874096
    # 3  -95.399247    ERCOT      TC_PUN1  29.703634  10000698825           6909.874096
    # 4  -95.399247    ERCOT      TC_PUN2  29.703634  10000698826           6909.874096

The data in ``node_search_df`` can be used to determine relevant nodes near a project
site and generally for mapping/plotting purposes. However, note that the node data
returned is related to the use of the ``location`` search parameter. To access other
node data such as data start and end years, node type, etc., we can use the
:meth:`lmp.get_all_nodes ` method

.. code-block:: python

    res = lmp.get_all_nodes(iso="ERCOT")
    res.raise_for_status()
    node_df = pd.DataFrame(res.json())
    print(node_df.head())
    # returns
    #    da_start_year  rt_end_year     zone          name       type       substation  da_end_year           id  rt_start_year
    # 0         2011.0         2025    SOUTH  OXY_OXY_CCG1  GENERATOR       OXYCHEM CC       2025.0  10000698721           2011
    # 1         2022.0         2025    SOUTH  SLCN_ESS_ALL  GENERATOR              NaN       2025.0  10016671726           2022
    # 2         2024.0         2025    SOUTH   ANEM_ESS_RN  GENERATOR              NaN       2025.0  10017318833           2024
    # 3         2011.0         2025    NORTH     WHCCS_CC1  GENERATOR  WOLF HOLLOW GEN       2025.0  10000698379           2011
    # 4         2011.0         2025  HOUSTON     SCLP_PUN4  GENERATOR     SWEENY COGEN       2025.0  10000698780           2011

Note that ``iso`` is a keyword argument, so simply typing ``lmp.get_all_nodes("ERCOT")``
won't work. The :meth:`~tyba_client.client.LMP.search_nodes` and
:meth:`~tyba_client.client.LMP.get_all_nodes` methods can be used together for more
complex node filtering.
For example, let's suppose that we want the 3 closest unique nodes to our location
that also have at least 3 years of real-time price data

.. code-block:: python

    node_df = node_df.set_index("id")  # set the id column as the index
    node_search_df = node_search_df.set_index("node/id")
    # combine the 2 dfs by id, leaving only the subset in node_search_df
    node_df = pd.concat([node_df, node_search_df], axis=1, join="inner")
    node_df.index.name = "id"
    # filter for data span
    node_df = node_df[node_df["rt_end_year"] - node_df["rt_start_year"] >= 3]
    # filter for unique lat/lon (the "," separator keeps distinct coordinates from
    # concatenating to the same string)
    node_df["latlon"] = node_df["node/lng"].astype(str) + "," + node_df["node/lat"].astype(str)
    node_df = (
        node_df.reset_index()
        .groupby("latlon")
        .first()
        .set_index("id")
        .sort_values(by="node/distance-meters")
    )
    print(node_df[["rt_start_year", "rt_end_year", "node/distance-meters"]].head())
    # returns
    #              rt_start_year  rt_end_year  node/distance-meters
    # id
    # 10016437277           2021         2025           3692.691396
    # 10000698826           2011         2025           6909.874096
    # 10000698744           2011         2024          12913.743104
    # 10016232998           2021         2025          13086.640441
    # 10016437499           2021         2024          14409.202592

Fetching Historical Energy Prices For Multiple Nodes
----------------------------------------------------

We can pass multiple node IDs to the :meth:`lmp.get_prices ` method. Continuing the
example from the previous section, let's suppose that we are interested in real-time
energy price data for the top 3 nodes in ``node_df``, and generate a real-time price
dataframe organized by node name
.. code-block:: python

    node_df = node_df.reset_index()
    node_df = node_df.loc[:2, :]  # keep only the top 3 nodes
    ids_names = dict(zip(node_df["id"], node_df["name"]))
    res = lmp.get_prices(
        node_ids=node_df["id"].values.tolist(),
        market=Market.RT,
        start_year=2021,
        end_year=2023,
    )
    res.raise_for_status()
    price_data = res.json()
    price_df = pd.concat(
        [
            pd.Series(index=v["datetimes"], data=v["prices"], name=k)
            for k, v in price_data.items()
        ],
        axis=1,
    )
    price_df = price_df.rename(columns=ids_names)
    print(price_df.head())
    # returns
    #                      BRHEIGHT_RN  TC_PUN2  PR_PR_G2
    # 2021-01-01T00:00:00      15.4500  15.4375   15.4125
    # 2021-01-01T01:00:00      14.8725  14.8750   14.8700
    # 2021-01-01T02:00:00      14.5350  14.5375   14.5350
    # 2021-01-01T03:00:00      15.8525  15.8475   15.8350
    # 2021-01-01T04:00:00      17.0200  16.9800   16.9175

Please consult the :class:`LMP docs ` for more details and the `Example Notebooks`_
for more examples.

Fetch Historical Ancillary Prices for a Single Region
-----------------------------------------------------

Let's suppose we're interested in Day-Ahead Regulation Up prices in the SP15 region
in CAISO for the year 2020. For this task, we would use the
:meth:`ancillary.get_prices ` method

.. code-block:: python

    from tyba_client.models import AncillaryService

    res = ancillary.get_prices(
        iso="CAISO",
        service=AncillaryService.REGULATION_UP,
        market=Market.DA,
        region="SP15",
        start_year=2020,
        end_year=2020,
    )
    res.raise_for_status()
    as_prices = res.json()
    as_prices = pd.Series(
        index=pd.to_datetime(as_prices["datetimes"]).tz_localize(None),
        data=as_prices["prices"],
    )
    print(as_prices.head())
    # returns
    # 2020-01-01 00:00:00    12.27
    # 2020-01-01 01:00:00     6.34
    # 2020-01-01 02:00:00     6.49
    # 2020-01-01 03:00:00    10.49
    # 2020-01-01 04:00:00     8.32

Fetch Ancillary Prices for Multiple Regions, Services, and Markets
------------------------------------------------------------------

Unlike the :meth:`lmp.get_prices ` method, :meth:`ancillary.get_prices ` only pulls a
single price time series.
As such, we pull multiple price series by iterating and combining afterwards. For
example, let's fetch Day-Ahead and Real-Time prices for Regulation Up and Regulation
Down for both the SP15 and NP15 regions in CAISO in 2022

.. code-block:: python

    regions = dict()
    for region in ["SP15", "NP15"]:
        service_dfs = dict()  # named to avoid shadowing the `services` client interface
        for service in [AncillaryService.REGULATION_DOWN, AncillaryService.REGULATION_UP]:
            market_df = pd.DataFrame()
            for market in [Market.DA, Market.RT]:
                as_prices = ancillary.get_prices(
                    iso="CAISO",
                    service=service,
                    market=market,
                    region=region,
                    start_year=2022,
                    end_year=2022,
                ).json()
                market_df[market] = pd.Series(
                    index=pd.to_datetime(as_prices["datetimes"]).tz_localize(None),
                    data=as_prices["prices"],
                )
            # df of market prices for each region-service combo
            service_dfs[service] = market_df
        regions[region] = pd.concat(service_dfs, axis=1)  # df for each region
    as_df = pd.concat(regions, axis=1)  # df combining regions
    print(as_df.head())
    # returns
    #                                SP15                                        NP15
    #                     Regulation Down     Regulation Up      Regulation Down     Regulation Up
    #                    dayahead realtime dayahead realtime   dayahead realtime dayahead realtime
    # 2022-01-01 00:00:00    6.35      0.0     1.70      3.30      0.75      0.0     3.00      2.45
    # 2022-01-01 01:00:00    3.00      0.0     1.28      1.50      3.00      0.0     1.88      1.88
    # 2022-01-01 02:00:00    2.66      0.0     1.28      1.70      5.01      0.0     3.42      8.04
    # 2022-01-01 03:00:00    1.44      0.0     1.28      2.56      5.02      0.0     3.42      2.41
    # 2022-01-01 04:00:00    1.44      0.0     1.28      2.00     12.84      0.0     3.42      3.00

Working with Ancillary Region Data
----------------------------------

The :meth:`~tyba_client.client.Ancillary.get_prices` method requires 5 data points to
fetch prices: ISO, region, service, market, and date range. In the above examples, we
knew which pricing regions we were interested in and, implicitly, that the
combinations of region, service, market, and year were available in the dataset. In
situations where we don't know what date ranges or even regions are available, we can
use the :meth:`ancillary.get_pricing_regions ` method to find out.
We specify 3 of the input data points (ISO, service, and market) and fetch the other
2 (available regions and their respective date ranges)

.. code-block:: python

    res = ancillary.get_pricing_regions(
        iso="CAISO", service=AncillaryService.REGULATION_UP, market=Market.DA
    )
    res.raise_for_status()
    as_regions = pd.DataFrame(res.json())
    print(as_regions.head())
    # returns
    #                      region  start_year  end_year
    # 0  Pacific Northwest - SP15        2010      2025
    # 1                      WAPA        2010      2025
    # 2                      WAPA        2010      2025
    # 3                      WAPA        2010      2025
    # 4                      ZP26        2010      2025

More complex analyses can be performed by combining the results of multiple calls to
:meth:`~tyba_client.client.Ancillary.get_pricing_regions`. For example, let's suppose
we want to know the regions and date ranges that are available in CAISO for real-time
and both Regulation Up and Regulation Down

.. code-block:: python

    import numpy as np

    dfs = dict()
    for service in [AncillaryService.REGULATION_UP, AncillaryService.REGULATION_DOWN]:
        res = ancillary.get_pricing_regions(
            iso="CAISO", service=service, market=Market.RT
        )
        res.raise_for_status()
        # currently duplicates are returned in the results
        dfs[service] = pd.DataFrame(res.json()).drop_duplicates()
    as_regions = dfs[AncillaryService.REGULATION_UP].merge(
        dfs[AncillaryService.REGULATION_DOWN],
        on="region",
        suffixes=["_reg_up", "_reg_down"],
    )
    # the common date range is bounded by the later start and the earlier end
    as_regions["min_start_year"] = np.maximum(
        as_regions["start_year_reg_up"], as_regions["start_year_reg_down"]
    )
    as_regions["max_end_year"] = np.minimum(
        as_regions["end_year_reg_up"], as_regions["end_year_reg_down"]
    )
    print(as_regions[["region", "min_start_year", "max_end_year"]].head())
    # returns
    #           region  min_start_year  max_end_year
    # 0        Arizona            2011          2025
    # 1          LADWP            2011          2025
    # 2         Mexico            2011          2025
    # 3  Nevada - NP26            2011          2025
    # 4  Nevada - SP15            2011          2025

Please consult the :class:`Ancillary docs ` for more details and the
`Example Notebooks`_ below for more examples.

Example Notebooks
-----------------

.. nbgallery::
    :glob:

    example-hist-notebooks/*