Trading strategies data mining

Our business runs a stock advisory service. Some of our clients trade our recommended stocks at a loss by buying at a high price and selling at a low price. We need some very simple trading models with comparative results to show clients how they can grow the value of their investment accounts with our stock picks.

Data mining - especially the pre-processing, preliminary data analysis, and reporting steps - can be very helpful for addressing your problem. This tip examines six stocks, such as those that might be recommended by your firm, to present the outcome of three different trading strategies. The stock trading strategies are purposely simple so that they can be readily programmed with T-SQL as well as understood by your stock trading clients. All strategies covered are consistent with showing clients how they can grow the value of their investment accounts.

The source data for the tip are stock prices from the Google Finance site. These data are available for free. The tip illustrates how to download stock prices to csv files. Then, the data are transferred to SQL Server tables. After the transfer, the stock price data are pre-processed to make them suitable for examining how each of the three stock trading strategies performs.

One stock trading strategy depends on moving averages for stock prices. This tip includes a simple explanation of moving averages as well as an easy way to compute them for stock price data.

The final tip element compares gains and losses from the three trading strategies. The comparisons are for the six stocks individually and overall. Comparisons are computed on a per share basis as well as a share lot basis. A share lot is a set of shares for a stock that are bought and sold as a unit.

You can download historical end-of-day price and volume data for a stock from the Google Finance site.
Simply enter a URL with parameters, including the stock's symbol as well as the start and end dates for the range of data that you seek. You also need to specify the output format. Your browser, such as Chrome or Internet Explorer, will retrieve the daily stock prices and volumes to a file on your computer.

The URL used for this tip specifies the retrieval of historical price and volume data for a stock with the symbol crus. The data start on the first trading day on or after January 1 of the start year and run through the last trading day for which data are available up until August 31 of the end year. The download was taken during the morning of August 24. The Google Finance site automatically names the downloaded file after the symbol specified in the URL (for example, crus).

Here's a screen shot from Excel showing the first 20 rows of data in the crus csv file. Notice that there is a separate row for each trading day, ending on August 23, the last trading day for which end-of-day data were available. Aside from the Date column, there are four columns specifying money values (Open, High, Low, Close), and a fifth column shows shares traded on a date for the stock designated by the symbol. While Excel automatically transforms the character data in the crus csv file for display, the values stored in the file remain character data.

Data for six stock symbols were downloaded for this tip. The following bullets show the symbols along with the corresponding company names and short descriptions. You can see from the descriptions that great diversity is readily available.

- One of the stocks is an ADR. An ADR is a stock that trades in the United States but represents a specified number of shares in a foreign corporation.
- meet is for MeetMe, Inc. The company makes available social networks for meeting new people in the US and in Latin America.
- orly is for O'Reilly Automotive Inc, a retailer of automotive parts and accessories.
- stmp is for Stamps.
- Another of the six firms sells cosmetics, fragrances, skin and hair care products, appliances, and accessories.
The SSIS package for this tip populates the SQL Server tables from the downloaded csv files - one table per file. If you do not have the database on your server, you can create it by running a script.

Because the data in the csv files are character-based, you must transform them before you can use them as dates, money, or integers in SQL Server. There are several ways to perform the transformations. This tip demonstrates how to use built-in SSIS transformation features to convert the data to an appropriate data type.

The following screen shot shows a Control Flow view of the SSIS project for the stock price data downloaded from the Google Finance site. Some annotation text and two steps are highlighted. The highlighted content is for importing the downloaded data into SQL Server. Additionally, there are 7 connection managers displayed below the control flow area. There are six Execute SQL Task steps in the container - one for each table to receive downloaded data from a csv file. All tables have the same specification except for the symbol name.

When you open a flat file connection manager for the orly csv file, it expects column headers in the first row, which corresponds to the downloaded format. Therefore, no changes are required to the default General tab other than assigning a connection manager name. This tip uses settings on the Advanced tab to indicate SSIS data types for reading csv file data in a way that is suitable for transferring them to data types in SQL Server tables. Three distinct formats are designated, as indicated in the screen shots.

However, the data for the orly symbol failed to load successfully on the first try. Notice that the bad data were for the Open, High, and Low columns. The data for the Date and Close column values appeared valid, but the data for the Volume column did not appear valid. However, this tip only requires valid, correct Date and Close column values.
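The character-to-typed conversion described above can also be sketched outside SSIS. The Python snippet below is only an illustration, not the tip's SSIS package. It assumes a csv layout with Date, Open, High, Low, Close, and Volume headers, assumes ISO-formatted dates, and uses invented sample values. Fields that cannot be converted become None, so a row with unusable Open, High, or Low values can still supply its Date and Close.

```python
import csv
import io
from datetime import date
from decimal import Decimal, InvalidOperation


def to_money(text):
    """Convert a character field to Decimal, or None if it is not numeric."""
    try:
        return Decimal(text)
    except InvalidOperation:
        return None


def to_int(text):
    """Convert a character field to int, or None if it is not an integer."""
    try:
        return int(text)
    except ValueError:
        return None


def parse_prices(csv_text):
    """Turn csv text with a header row into a list of typed row dictionaries."""
    rows = []
    for rec in csv.DictReader(io.StringIO(csv_text)):
        rows.append({
            "date": date.fromisoformat(rec["Date"]),  # assumed ISO date format
            "open": to_money(rec["Open"]),
            "high": to_money(rec["High"]),
            "low": to_money(rec["Low"]),
            "close": to_money(rec["Close"]),
            "volume": to_int(rec["Volume"]),
        })
    return rows


# Invented sample: the first data row has unusable Open, High, and Low fields.
SAMPLE = (
    "Date,Open,High,Low,Close,Volume\n"
    "2000-01-04,-,-,-,25.50,0\n"
    "2000-01-03,25.00,26.10,24.80,25.75,120000\n"
)

rows = parse_prices(SAMPLE)
print(rows[0]["open"], rows[0]["close"])
```

Decimal is used for the money columns so that prices are not subject to binary floating point rounding.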
It was the bad data for the Open, High, and Low columns that caused the import of the orly csv file to fail. You can bypass the failure by configuring settings within the Flat File Source for the orly csv step to assign NULL values to columns with invalid data during the import process. The following screen shot shows the Ignore failure settings for the Open, High, and Low columns in the orly csv step. With these configuration changes in the Error Data tab of the Flat File Source for the orly csv step, it was possible to load the orly data.

Up to this point, the imported data exist in six separate tables with dates going back to the first trading date in the start year or whenever the Google Finance site first started reporting stock prices for a symbol. Aside from variations in the start date for different symbols, I sometimes noticed dates with rows of data for one symbol that were missing for other symbols. For the six stock symbols used in this tip, these missing dates were very rare and limited to early data that precede the evaluation window. The missing data are not relevant to this tip because stock trading rules were compared only for trading in the 25-month window from July through the July two years later.

In addition to pre-processing filters for the date range, other pre-processing was implemented.

- Data were consolidated from six separate tables without a symbol indicator into one table with a distinct symbol indicator for each stock. This step makes it easier to evaluate trading strategies across all six stocks.
- Also, 10-day and 30-day moving averages were computed from the base stock price data. Moving averages are a common technical analysis tool for analyzing stock trends. The 10-day moving average reflects short-term trends, and the 30-day moving average reflects longer-term trends.

The T-SQL script for this step selects data from the crus table, adds a symbol column, and computes 10-day and 30-day moving averages. A linked article compares 6 different methods for computing a sum over a rolling window, such as 10 days or 30 days.
The method with the best performance used a customized version of the SUM function that depended on the PRECEDING and CURRENT ROW key words. The code for this tip adapts the best method to compute moving averages instead of sums over a rolling window.

The following screen shot displays an excerpt from the result set for the script, including the 10-day and 30-day moving average columns. Notice that values for these columns do not start until the tenth and thirtieth rows, respectively. This is because a 10-day moving average requires 10 days of values (the current day plus the 9 preceding days), and a 30-day moving average requires 30 days of values. Rolling windows determine which dates contribute to the moving average values on each row.

The pre-processed table covers the stock trading days from July through the July two years later. The following screen shot shows the first 31 rows in the table. Because the data are arranged by Date within Symbol, these rows are for the earliest 31 trading days for the crus symbol. Because the moving averages are computed on values that go back before the start of this window, there are no NULL values for either the 10-day or 30-day moving averages.

Before moving to the next pre-processing step for the stock price data, it will be helpful to review the three trading strategies evaluated in this tip. Recall that the objective is to evaluate simple trading rules because we want the rules to be easily understood by clients of the stock advisory service. Also, we seek rules that are safe to trade - so that gains accrue if stock prices rise throughout an evaluation period (we'll cite a couple of examples where a trade can lose money even while a stock price is rising over an extended period). This tip evaluates trading strategies for the 25-month period from July through the July two years later. Additionally, we need rules that can be readily compared to one another. The rules examined in this tip are more like benchmarks for contrasting trading styles than precise recommendations on how to trade stocks.
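Before the rules themselves, here is a minimal sketch of the rolling-window moving average discussed earlier in this section. It is written in Python rather than the T-SQL the tip uses, so it illustrates the arithmetic only and is not the tip's script. The function name and sample prices are invented, and the choice to return None until a full window is available is an assumption that mirrors the result set described above, where values start on the tenth and thirtieth rows.

```python
def moving_average(values, window):
    """Average of the current value and the (window - 1) values before it.

    Rows that do not yet have a full window get None, the counterpart of
    a NULL in the result set.
    """
    averages = []
    running_sum = 0.0
    for i, value in enumerate(values):
        running_sum += value
        if i >= window:
            # Drop the value that has just slid out of the window.
            running_sum -= values[i - window]
        if i >= window - 1:
            averages.append(running_sum / window)
        else:
            averages.append(None)
    return averages


# Invented closing prices, with a 3-day window to keep the example short.
closes = [10.0, 11.0, 12.0, 13.0, 14.0]
print(moving_average(closes, 3))
```

Keeping a running sum means each row costs one addition and one subtraction regardless of the window length, so the same function serves for 10-day and 30-day windows.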
The three trading rules evaluated by this tip are as follows.

- The overall buy-and-hold rule buys shares for a stock at the Open price of the first trading day in the first July and sells those shares at the Close price on the last trading day in the final July.
- The monthly buy-and-hold rule buys shares for a stock at the Open price on the first trading day of each month in the 25-month period. On the last trading day of each month, the rule sells the shares purchased at the beginning of the month.
- The conditional buy-and-hold rule buys shares for a stock only when the short-term price trend at the beginning of the month (as indicated by the 10-day moving average) is greater than the long-term trend (as indicated by the 30-day moving average). Otherwise, no purchase is made for a stock during the month. To keep the code simple for this trading rule comparison, we assume that you know the 10-day and 30-day moving average values for the month's first trading day at the start of that day.

Trading plans are compared on a share price change as well as a lot price change basis. Share price change is the difference between the Open price on the first day of a period and the Close price on the last day of the period. If the Close price is greater than the Open price, then the rule results in a gain. Otherwise, the rule breaks even or loses money.

The start and end days change from one rule to the next.

- For the overall buy-and-hold rule, there is just one start date and one end date. The start date is the first trading day of the first July. The end date is the last trading day of the final July.
- For the monthly buy-and-hold rule, there is a start date and an end date for each of the 25 months. Within each month, the start date is always the first trading day of the month, and the end date is always the last trading day of the month.
- For the conditional buy-and-hold rule, there are a variable number of months in which stocks can be bought and sold.
Stocks are only bought in months when the 10-day moving average is greater than the 30-day moving average at the beginning of the month. If there is a stock purchase at the start of a month with this rule, then the close price is for the last trading day in the month.

Recall that the term lot refers to the collection of stock shares bought during a trade. The term lot size indicates the number of stock shares bought at one time. The lot size varies from one stock to the next within a month. The lot sizes computed for the monthly buy-and-hold rule are also used for the conditional buy-and-hold rule. The overall buy-and-hold rule uses the monthly buy-and-hold lot size of the first July for its first buy. Because at the end of 25 months you sell the shares acquired in the initial purchase with the overall buy-and-hold rule, the lot size sold in the final July is the same as the lot size bought in the first July.

In the monthly results for a single stock, notice that all 25 rows are for the crus symbol. For example, the Close column value less the Open column value indicates the gain or loss per share for the month with the monthly buy-and-hold rule. The conditional buy-and-hold rule only uses the difference between the Close and Open column values for months in which the 10-day moving average is greater than the 30-day moving average. By summing the gain or loss for each month in which there was a trade, you can derive the gain or loss across all 25 months for which stock prices are tracked. For the overall buy-and-hold rule, the Close price for the final July less the Open price for the first July returns the gain or loss for a stock. Just as with the monthly rules, you compute this value separately for each stock. The script contains two derived table queries.

Evaluating the Trading Rules

Three SSRS reports are provided as a model for evaluating the trading rules with the six stocks examined in this tip.
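As a companion to the reports, the arithmetic behind the per share and lot size comparisons can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the tip's T-SQL: the MonthRow type, the function names, and the three sample months are invented, lot sizes are taken as given inputs because the text here does not say how they are derived, and commissions are ignored.

```python
from typing import NamedTuple


class MonthRow(NamedTuple):
    open: float    # Open price on the month's first trading day
    close: float   # Close price on the month's last trading day
    ma10: float    # 10-day moving average at the start of the month
    ma30: float    # 30-day moving average at the start of the month
    lot_size: int  # shares per lot for the month (taken as given)


def overall_buy_and_hold(months):
    """Buy at the first month's Open, sell at the last month's Close."""
    per_share = months[-1].close - months[0].open
    return per_share, per_share * months[0].lot_size


def monthly_buy_and_hold(months):
    """Buy at each month's Open and sell at that month's Close."""
    per_share = sum(m.close - m.open for m in months)
    per_lot = sum((m.close - m.open) * m.lot_size for m in months)
    return per_share, per_lot


def conditional_buy_and_hold(months):
    """Trade only the months whose 10-day average exceeds the 30-day average."""
    traded = [m for m in months if m.ma10 > m.ma30]
    return monthly_buy_and_hold(traded)


# Invented sample: the second month is a loss that the conditional rule skips.
months = [
    MonthRow(open=10.0, close=12.0, ma10=11.0, ma30=10.0, lot_size=100),
    MonthRow(open=12.0, close=11.0, ma10=10.0, ma30=11.0, lot_size=80),
    MonthRow(open=11.0, close=15.0, ma10=12.0, ma30=11.5, lot_size=90),
]

for rule in (overall_buy_and_hold, monthly_buy_and_hold, conditional_buy_and_hold):
    print(rule.__name__, rule(months))
```

Each function returns a pair: the gain or loss per share, then the gain or loss on a lot size basis. The invented numbers are not meant to reproduce the results reported in this tip.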
The summary results do not include trading costs because commissions per trade can vary or even be zero in some cases depending on the broker and security in which you invest. However, if you plan on using a particular broker with a standard fee per trade, you may want to factor in a broker commission.

The first report is a top line report comparing the three trading strategies for gain versus loss on a per share basis and a lot size basis. As the screen shot below shows, all three trading strategies generated gains over the 25-month evaluation period for each of the stocks and overall. The trading strategy based on the 10-day moving average versus the 30-day moving average generated the smallest gains by a substantial margin. We show why this outcome is obtained in the next couple of reports.

As you can see, there are 8 columns in the body of the top-line report. The query for the report starts with a SELECT statement that references two main derived table queries - each of which has its own sub-queries. The listing for the report shows just the outermost query and references to the main derived table queries so you are not distracted by details from getting the big picture about how the report compiles data. If you wish, you can examine the complete query for the top line comparison of trading strategies report in the SSRS project available for download with this tip.

The next screen shot presents another report with detailed results for the monthly buy-and-hold trading rule. You can use this report to examine results for any of the six stocks in the data set for this tip. The screen shot below shows results for the stock with the crus symbol. This second report displays monthly trading gain-loss outcomes on a per share basis and a lot size basis for the symbol entered into the SYMBOL selection box. Months with a loss show their outcome in red; otherwise, the per share and per lot size outcome shows in green.
Furthermore, the stock loses value in 4 of the first 5 months and in an additional span of three consecutive months (June through August) that were analyzed.

The query for reporting monthly buy-and-hold trade outcomes has an outermost SELECT statement that references two derived table queries that are inner joined. Also, the SYMBOL parameter in the code's last line allows the report user to specify a stock symbol for which to show results.

The third report shows results for the trading rule based on moving averages. Both this report and the preceding one are for the stock with a crus symbol. The result of the rule is that trades are not made in 10 months for which the every-month rule makes a trade. In those 10 months, the trading rule based on moving averages skipped 5 losses and 1 no-change outcome. These good results from the moving average rule are counter-balanced, in part, by missed gains in 4 months. The shrinkage in gains was even greater for the remaining 5 stocks.

My take-away from these results is that the moving average rule is not sufficiently accurate about discovering when a month is likely to result in a loss versus a gain. More research to discover better trading rules for avoiding losses while not missing gains might result in selective trading strategies that are better than buy-and-hold or buy-and-sell every month.

The following script shows the query for the report on the moving average trading rule. This script differs from the script for the every-month rule in that it shows 10-day and 30-day moving averages and that it only presents trading outcomes for months where the initial 10-day moving average is greater than the initial 30-day moving average.

Copyright (c) Edgewood Solutions, LLC. All rights reserved. Some names and products listed are the registered trademarks of their respective owners.
Using SQL Server Data Analysis for Stock Trading Strategies

Rick Dobson is a Microsoft Certified Technical Specialist and well accomplished SQL Server and Access author.

