Malcook Gedanken
malcook (http://www.blogger.com/profile/13427148069214826368)
Feed updated 2023-11-15T08:03:48.476-08:00

Get more aggressive control over recursive deletion with Emacs dired
2012-06-26T13:27:00.003-07:00 (updated 2012-06-26T13:59:14.020-07:00)

Using the Emacs directory editor, dired, as much as I do, I've wanted a quick way to force recursive deletion of a selection of (sub(sub)) directories without having to confirm each deletion.
Here's a snippet for your init.el that builds on Emacs' existing variable `dired-recursive-deletes`.
<script src="https://gist.github.com/2998639.js?file=defadvice dired-do-flagged-delete"></script>

HOWTO: turn off orgmode's evaluation of source blocks during export
2011-10-04T10:55:00.001-07:00 (updated 2011-10-04T10:57:07.260-07:00)

<p>I find that I generally do NOT want org mode to execute source code blocks as a side effect of exporting, which is the default behavior. Rather, I prefer to execute the code explicitly, either interactively block by block (such as when I'm developing an analysis), or using the org-babel-execute-(buffer|subtree) commands. So, I typically set `org-export-babel-evaluate` to nil on a per-buffer basis with the following as the first line of the buffer:</p><p><span class="Apple-style-span" style="font-family: monospace; font-size: 13px; white-space: pre; "># -*- org-export-babel-evaluate: nil; -*-</span></p>

Tabulating gaps in BAM files
2011-06-09T07:42:00.001-07:00 (updated 2011-06-09T11:16:53.846-07:00)

If you're analyzing RNA-SEQ data and care about quantifying splicing, read on...<br />Here is an approach to tabulating the counts of gaps present in a BAM file.<br />It is implemented in R using Bioconductor packages.<br />It takes about a minute to tabulate 33 million reads on my hardware.<div>It produces a data.frame for further downstream analysis of your choosing, and, optionally, a corresponding "junctions.bed" file (i.e. 
as is produced by <a href="http://tophat.cbcb.umd.edu/">Tophat</a> and can be <a href="http://www.broadinstitute.org/software/igv/wiki/index.php/Viewing_Splice_Junctions">visualized in IGV</a>).<br />Here's an example of using it and inspecting its output:<div><div><pre></pre></div><div><div>> btg <- bamTabulateGaps('t/t1.sam.sorted.bam')</div><div>> names(btg)</div><div>[1] "bedPath" "tabulatedGaps"</div><div>> btg$bedPath</div><div>[1] "t/t1.sam.sorted.junctions.bed"</div><div>> system(paste('head ', btg$bedPath))</div><div>track name=t1 graphType=junctions</div><div>2L<span class="Apple-tab-span" style="white-space:pre"> </span>11343<span class="Apple-tab-span" style="white-space:pre"> </span>11410<span class="Apple-tab-span" style="white-space:pre"> </span>.<span class="Apple-tab-span" style="white-space:pre"> </span>25</div><div>2L<span class="Apple-tab-span" style="white-space:pre"> </span>11517<span class="Apple-tab-span" style="white-space:pre"> </span>11779<span class="Apple-tab-span" style="white-space:pre"> </span>.<span class="Apple-tab-span" style="white-space:pre"> </span>51</div><div>2L<span class="Apple-tab-span" style="white-space:pre"> </span>12220<span class="Apple-tab-span" style="white-space:pre"> </span>12286<span class="Apple-tab-span" style="white-space:pre"> </span>.<span class="Apple-tab-span" style="white-space:pre"> </span>121</div><div>2L<span class="Apple-tab-span" style="white-space:pre"> </span>12927<span class="Apple-tab-span" style="white-space:pre"> </span>13520<span class="Apple-tab-span" style="white-space:pre"> </span>.<span class="Apple-tab-span" style="white-space:pre"> </span>109</div><div>2L<span class="Apple-tab-span" style="white-space:pre"> </span>13624<span class="Apple-tab-span" style="white-space:pre"> </span>13683<span class="Apple-tab-span" style="white-space:pre"> </span>.<span class="Apple-tab-span" style="white-space:pre"> </span>109</div><div>2L<span class="Apple-tab-span" style="white-space:pre"> 
</span>14873<span class="Apple-tab-span" style="white-space:pre"> </span>14933<span class="Apple-tab-span" style="white-space:pre"> </span>.<span class="Apple-tab-span" style="white-space:pre"> </span>114</div><div>2L<span class="Apple-tab-span" style="white-space:pre"> </span>15710<span class="Apple-tab-span" style="white-space:pre"> </span>17053<span class="Apple-tab-span" style="white-space:pre"> </span>.<span class="Apple-tab-span" style="white-space:pre"> </span>38</div><div>2L<span class="Apple-tab-span" style="white-space:pre"> </span>17211<span class="Apple-tab-span" style="white-space:pre"> </span>18026<span class="Apple-tab-span" style="white-space:pre"> </span>.<span class="Apple-tab-span" style="white-space:pre"> </span>4</div><div>2L<span class="Apple-tab-span" style="white-space:pre"> </span>17211<span class="Apple-tab-span" style="white-space:pre"> </span>18261<span class="Apple-tab-span" style="white-space:pre"> </span>.<span class="Apple-tab-span" style="white-space:pre"> </span>5</div><div>> head(btg$tabulatedGaps)</div><div> space start end score</div><div>1 2L 11344 11410 25</div><div>2 2L 11518 11779 51</div><div>3 2L 12221 12286 121</div><div>4 2L 12928 13520 109</div><div>5 2L 13625 13683 109</div><div>6 2L 14874 14933 114</div></div><div></div><div><br /></div><div>Enjoy the code:</div><div><script src="https://gist.github.com/1016957.js"> </script></div></div></div>

Control IGV from Excel or other Macintosh/Windows Application using Applescript/VBA
2010-04-07T14:21:00.000-07:00 (updated 2010-05-26T10:26:55.989-07:00)

<blockquote></blockquote><a href="http://www.broad.mit.edu/igv/">IGV</a> is great for browsing genomes. Here is some quick Applescript|VBA that you can install into an Excel workbook (or other Mac|Windows app) to allow navigating IGV to the locus in the current cell. 
With this approach, I can deliver Excel workbooks containing RNASeq expression values, and the recipient can sort/filter or otherwise slice and dice them, and easily navigate to the genes that survive the dicing...<br /><br />I'm likely to make some improvements on this and build an installer for it, but I'm putting it out here and now in case someone<br /> * has already done this better<br /> * has suggestions for improvement to take into consideration<br /> * wants to use it as is<br /><pre>-- NAME: IGVGoTo.scpt<br />-- PURPOSE: Applescript to cause IGV (http://www.broad.mit.edu/igv/) to 'goto' the locus appearing in Excel's active cell.<br />-- AUTHOR: malcolm_cook@stowers.org<br />-- REQUIRES:<br />-- * installing XNet as obtained from http://lestang.org/spip.php?article20<br />-- * the XOSL.Framework gets installed into Users/<you>/Library/Frameworks<br />-- * configuring IGV > View > Preferences > advanced > Enable Port<br />-- it must be checked ON and set to 60151 (the default)<br />-- INSTALLATION:<br />-- * Save this script as Users/<you>/Documents/Microsoft User Data/Excel Script Menu Items/IGVGoTo.scpt<br />-- * OPTIONAL: assign a keyboard shortcut:<br />-- * Open the System Preferences, click on "Keyboard & Mouse" and select the "Keyboard Shortcuts" tab.<br />-- * Click on the plus sign beneath the list view. In the dialog sheet that pops up, select Microsoft Excel from the Application menu (install it if needed - I had to)<br />-- * type "IGVGoTo" into the "Menu Title" field, tab to the Keyboard Shortcut field, and type in the shortcut you would like to use (I prefer Command-Option-I) and click the Add button<br />-- RUNNING IT:<br />-- * IGVGoTo will appear in Microsoft Excel's script menu (rightmost menu)<br />-- * navigate to a cell whose value is a locus for the currently displayed IGV genome and run the script.<br />-- The active cell's value will be used as the `locus` target of a `goto` command<br />-- * c.f. 
http://www.broadinstitute.org/igv/PortCommands for valid targets<br />-- TODO<br />-- * bring IGV to the front (optionally?)<br />-- * dispatch on the frontmost application - get the locus appropriate to the application - implement for<br />-- MSWord and FileMakerPro<br />-- * trap errors and provide diagnostics (e.g. XNet not installed, port not openable, IGV not running, bad response from IGV, etc)<br /><br />tell application "Microsoft Excel"<br /> my IGVGoTo("localhost", 60151, value of active cell as string)<br />end tell<br /><br />on IGVGoTo(theHost, thePort, theLocus)<br /> tell application "XNet"<br /> launch<br /> set s to make new socket with properties {host:theHost, port:thePort}<br /> socket open s<br /> socket write s data "goto " & theLocus<br /> delete s<br /> end tell<br />end IGVGoTo<br /></pre><br />Here's the VBA for the Windows folk:<br /><pre><br />Public Sub IGV_Goto()<br />'PURPOSE: Excel Macro to cause IGV (http://www.broad.mit.edu/igv/) to 'goto' the locus appearing in Excel's active cell.<br />' AUTHOR: malcolm_cook@stowers.org<br />' REQUIRES:<br />' * configuring IGV > View > Preferences > advanced > Enable Port<br />' it must be checked ON and set to 60151 (the default)<br />' * installing a free winsock implementation from "http://www.ostrosoft.com/oswinsck/oswinsck_vba.asp"<br />' INSTALLATION:<br />' * create a new VBA Macro out of this script in the desired Excel workbook - like this:<br />' alt-F11 to open VBA<br />' click on 'This workbook'<br />' paste in this script<br />' create a reference to the winsock library, by checking "OstroSoft Winsock Component" under "Tools > references".<br />' quit VBA with alt-q<br />' Alt-F8 (back in Excel) will list your macros - click option button to assign to a key (I use Ctrl-Shift-I)<br /><br /><br /> <br /> On Error GoTo Err<br /> Dim cmd As String<br /> Dim locus As String<br /> Dim r As Excel.Range<br /> Set r = Excel.ActiveCell.EntireRow '.CurrentArray.RowDifferences..Rows<br /> locus 
= Excel.ActiveCell.Value<br /> cmd = "goto " & locus<br /> Dim wso As New OSWINSCK.TCP ' free winsock implementation from "http://www.ostrosoft.com/oswinsck/oswinsck_vba.asp"<br /> wso.Connect "localhost", 60151<br /> wso.SendData cmd & vbCr<br /> wso.Disconnect<br />ok:<br /> Exit Sub<br />Err:<br /> MsgBox Err.Description<br />End Sub<br /></pre>

eliminating redundant indexes from your mysql database
2008-10-28T10:13:00.000-07:00 (updated 2008-10-28T10:16:41.720-07:00)

To find redundant (unnecessary) indexes in your MySQL database, try the information schema views in<br /><a href="http://forge.mysql.com/tools/tool.php?id=45">Roland Bouman's Redundant Index Finder</a>. To delete them, try the procedure I_S_REDUNDANT_INDEXES_DROP which I just posted to the Community Feedback section.

HOWTO: search unix's $LD_LIBRARY_PATH
2007-09-11T09:08:00.000-07:00 (updated 2007-09-11T09:13:44.987-07:00)

In figuring out how to search $LD_LIBRARY_PATH for (hopefully not multiply) installed versions of libraries, I found that $LD_LIBRARY_PATH, being colon-delimited, was not a suitable argument for the unix `find` command.<br /><br />Thus, an 'on-the-fly' edit to replace the colons with spaces is needed, consing up a suitable arg to find.<br /><br />I had never used the string substitution modifier to bash's parameter expansion. 
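As a quick illustration of that modifier: `${parameter//pattern/string}` replaces every match of `pattern` in the value, while a single slash replaces only the first. This is a minimal sketch, not part of the original recipe; the variable `paths` and its directories are made up and stand in for `$LD_LIBRARY_PATH`.

```bash
#!/usr/bin/env bash
# Hypothetical stand-in for $LD_LIBRARY_PATH; the directories are made up.
paths="/usr/lib:/usr/local/lib:/opt/lib"

# ${var//pattern/string}: replace EVERY match of pattern (here, each colon).
all_replaced="${paths//:/ }"

# ${var/pattern/string}: replace only the FIRST match.
first_replaced="${paths/:/ }"

echo "$all_replaced"     # /usr/lib /usr/local/lib /opt/lib
echo "$first_replaced"   # /usr/lib /usr/local/lib:/opt/lib

# Alternative that does not rely on word splitting of an unquoted
# expansion: read the colon-separated entries into an array.
IFS=: read -r -a dirs <<< "$paths"
echo "${#dirs[@]}"       # 3
```

Left unquoted on a command line, the first form splits into one word per directory, which is what `find` wants. The array form is an option if any directory name might contain spaces.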
Now I do.<br /><br />For example:<br /><br />>find ${LD_LIBRARY_PATH//:/ } -maxdepth 1 -name 'libreadline.*' -print<br />/usr/lib/libreadline.so.4.3<br />/usr/lib/libreadline.so<br />/usr/lib/libreadline.a<br />/usr/lib/libreadline.so.4<br /><br /><br />Which prompts me to write the following bash function:<br /><br />function llpfind {<br /> # PURPOSE: search $LD_LIBRARY_PATH <br /> # EXAMPLE: llpfind -name 'libreadline.*'<br /> # (quote the pattern so the shell does not expand it before find sees it)<br /> find ${LD_LIBRARY_PATH//:/ } -maxdepth 1 "$@" -print<br />}

HOWTO: inferring SO compliant features for splice_donor_site and splice_acceptor_site given a gene model
2007-04-11T08:07:00.000-07:00 (updated 2007-09-11T09:20:57.754-07:00)

Correctly inferring SO compliant features for splice_donor_site and splice_acceptor_site given a gene model can be tricky.<br /><br />I hope the following simplified example is useful for understanding the issue.<br /><br /><br />EXAMPLE<br />============================<br /><br />Given this simplified gene model containing two exons, each 3bp long:<br /><br />123456789<br />EEEIIIEEE<br />>>>--->>><br /><br />and given these SO definitions:<br /> <br /> splice_donor_site: The junction between the 3 prime end of an exon and the following intron.<br /> splice_acceptor_site: The junction between the 3 prime end of an intron and the following exon.<br /><br />...we should encode the gene as: <br /> exon(1,3,+) <br /> splice_donor_site(3,3,+)<br /> intron(4,6,+) <br /> splice_acceptor_site(6,6,+)<br /> exon(7,9,+)<br /><br />HOWEVER, if the gene codes the other way, viz.<br /><br />123456789<br />EEEIIIEEE<br /><<<---<<<<br /><br />...we should encode it as: <br /> exon(7,9,-)<br /> splice_donor_site(6,6,-)<br /> intron(4,6,-) <br /> splice_acceptor_site(3,3,-)<br /> exon(1,3,-) <br /><br />Note that the coordinates of the exon and intron are the same in both encodings, only the strand is 
different; AND the coordinates of the splice sites are also the same between encodings. This follows from reading this sentence of the <a href="http://www.sequenceontology.org/gff3.shtml">GFF3</a> specification: "For zero-length features, such as insertion sites, start equals end and the implied site is to the right of the indicated base in the direction of the landmark"<br /><br />and understanding "to the right of the indicated base in the direction of the landmark" as "1 plus the indicated base, in interbase coordinates".<br /><br />It is this understanding that I hope to have clarified by this example, demonstrating in particular that the splice sites should NOT be encoded in the second model as:<br /><br /> splice_donor_site(7,7,-)<br /> splice_acceptor_site(4,4,+)

HOWTO: Navigate to next/previous conditionally formatted record
2007-02-22T08:28:00.000-08:00 (updated 2007-02-22T22:31:33.816-08:00)

I offer this approach to extending Access's capabilities and welcome any criticism of the utility of the tool, the approach to providing it, and the implementation.<br /><br /><pre><br />Public Function cmdFindConditionallyFormatted(Optional vSearchDirection As Variant)<br /> 'PURPOSE: allow the database user to easily navigate to the next or previous<br /> ' record that passes the conditions defined using Access' native conditional formatting<br /> ' Find the next row in the current form that satisfies the conditional format criteria<br /> ' of the current control.<br /> ' If no such row, beep and stay put.<br /> ' Search according to vSearchDirection, where 'acDown' means 'after' the position of Access's 'Current' record in<br /> ' the form's recordset.<br /> ' vSearchDirection if provided should be an AcSearchDirection. If not provided, it is taken from the Parameter<br /> ' of the current CommandBarControl (assuming there is one).<br /> 'USAGE: call from a commandbar or macro. 
I bind this same function to two custom command bar items, each with a different .Parameter value ("acUp" and "acDown")<br /> 'TODO:<br /> ' * support multiple format conditions (but what are the semantics of this? Should they be 'ORed' together?)<br /><br /> On Error GoTo HandleErr<br /><br /> Dim SearchDirection As AcSearchDirection<br /> Dim ExtendSelection As Boolean<br /> Dim bm As Variant 'bookmark<br /> Dim ac As Access.Control<br /> Dim rs As DAO.Recordset<br /> Dim fct As Access.AcFormatConditionType<br /> Dim fco As Access.AcFormatConditionOperator<br /> Dim fc As Access.FormatCondition<br /> Dim fcs As Access.FormatConditions<br /> Dim crit As String<br /><br /> If Not IsMissing(vSearchDirection) Then<br /> SearchDirection = vSearchDirection<br /> ElseIf CommandBars.ActionControl Is Nothing Then Err.Raise 666, , "cannot determine vSearchDirection in call to cmdFindConditionallyFormatted"<br /> Else<br /> Select Case CommandBars.ActionControl.Parameter<br /> Case "acDown"<br /> SearchDirection = acDown<br /> Case "acUp"<br /> SearchDirection = acUp<br /> Case Else<br /> Err.Raise 666, , "invalid name of AcSearchDirection in CommandBars.ActionControl.Parameter:" & CommandBars.ActionControl.Parameter<br /> End Select<br /> End If<br /><br /> Set ac = Access.Screen.ActiveControl<br /><br /> Set fcs = ac.FormatConditions<br /> If fcs.Count = 0 Then Err.Raise 777, , ac.Controls(0).Caption & " has no Format Conditions by which to navigate. "<br /> If fcs.Count <> 1 Then Err.Raise 777, , ac.Controls(0).Caption & " has multiple Format Conditions. 
Only a single format condition may be placed on the current control for this to work"<br /> Set fc = fcs(0)<br /> fct = fc.Type<br /> Set rs = ac.Parent.Recordset<br /> If fc.Enabled Then<br /> Select Case fct<br /> Case acExpression<br /> crit = fc.Expression1<br /> Case acFieldValue<br /> Select Case fc.Operator<br /> Case acEqual<br /> crit = ac.ControlSource & " = " & fc.Expression1<br /> Case acNotEqual<br /> crit = ac.ControlSource & " <> " & fc.Expression1<br /> Case acLessThan<br /> crit = ac.ControlSource & " < " & fc.Expression1<br /> Case acLessThanOrEqual<br /> crit = ac.ControlSource & " <= " & fc.Expression1<br /> Case acGreaterThan<br /> crit = ac.ControlSource & " > " & fc.Expression1<br /> Case acGreaterThanOrEqual<br /> crit = ac.ControlSource & " >= " & fc.Expression1<br /> Case acBetween<br /> crit = ac.ControlSource & " >= " & fc.Expression1 & " AND " & ac.ControlSource & " <= " & fc.Expression2<br /> Case acNotBetween<br /> crit = ac.ControlSource & " < " & fc.Expression1 & " OR " & ac.ControlSource & " > " & fc.Expression2<br /> Case Else<br /> Err.Raise 666, , "unrecognized value for AcFormatConditionOperator: " & fc.Operator<br /> End Select<br /> Case acFieldHasFocus<br /> Case Else<br /> Err.Raise 666, , "unrecognized AcFormatConditionType: " & fct<br /> End Select<br /><br /> bm = rs.Bookmark ' to which we will return if no record found.<br /> Select Case SearchDirection<br /> Case acDown<br /> rs.FindNext (crit)<br /> Case acUp<br /> rs.FindPrevious (crit)<br /> Case Else<br /> Err.Raise 666, , "invalid AcSearchDirection: " & SearchDirection<br /> End Select<br /> If rs.NoMatch Then<br /> Beep<br /> rs.Bookmark = bm<br /> End If<br /> End If<br /><br />ExitHere:<br /> Exit Function<br /><br />HandleErr:<br /> Select Case Err.Number<br /> Case 666 ' program logic error - shouldn't happen<br /> MsgBox Err.Description<br /> Stop<br /> Case 777 'reportable condition - no problem, just report it to user<br /> MsgBox Err.Description<br /> Case 
Else 'unanticipated error. Write a new case!<br /> MsgBox Err.Number & ":" & Err.Description<br /> Debug.Assert False<br /> End Select<br /> Resume ExitHere<br /><br />End Function <br /></pre>

My Custom Google Search Box : bioinformatics
2006-10-24T14:06:00.000-07:00 (updated 2006-10-24T14:18:01.996-07:00)

Home page is at http://www.google.com/coop/cse?cx=003287490790234136152%3Alktwxienlxc<br /><br /><!-- Google CSE Search Box Begins --><br /><form id="searchbox_003287490790234136152:lktwxienlxc" action="http://google.com/cse"><br /> <input type="hidden" name="cx" value="003287490790234136152:lktwxienlxc" /><br /> <input name="q" type="text" size="40" /><br /> <input type="submit" name="sa" value="Search" /><br /> <input type="hidden" name="cof" value="FORID:0" /><br /></form><br /><script type="text/javascript" src="http://google.com/coop/cse/brand?form=searchbox_003287490790234136152%3Alktwxienlxc"></script><br /><!-- Google CSE Search Box Ends -->

What this?
2006-10-17T07:06:00.000-07:00 (updated 2006-10-17T07:08:40.920-07:00)

Joining the personal blogging revolution.<br />Might this work for me?<br />Topics:<br /> bioinformatics<br /> visual basic<br /> perl<br /> ms access