Thursday, June 21, 2012

Subtracting URL strings

Using the forwarding script discussed in our previous article on moving a blog to a new domain has some shortcomings. Here we look at improving it using JavaScript (if you’re in a hurry, jump to the very end).

We are considering the first script, as the second is buggy. That script sends the visitor on to the new blog with the URL:

http://new.doma.in/blogger/?q=http://old.doma.in/2012/06/somepage.html

This is constructed in the template thusly:

window.location.href="http://new.doma.in/blogger/?q=<$BlogItemPermalinkURL$>";

It is still possible to use this as it is for the most important pages by creating an alias – this is a new Blogger feature. Yet this is impractical for a large blog.

What we need to do is get rid of that /?q= and send the visitor to the TRUE new page, assuming the same blog structure:

http://new.doma.in/2012/06/somepage.html

To accomplish this, we’d have to somehow get the “root URL”.

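There's a rather large list of the Blogger variables we have at our disposal, courtesy of singpolyma. The classic template tag comes first on each line, followed by its equivalent in the newer layout templates (some equivalents span several lines).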
<$BlogPageTitle$> <data:blog.pageTitle/>
<$BlogMetaData$> <b:include data=’blog’ name=’all-head-content’/>
<style type="text/css"> <b:skin><![CDATA[
</style> ]]></b:skin>
<$BlogURL$> <data:blog.homepageUrl/>
<$BlogDescription$>  
<Blogger> <b:section class=’posts’ id=’posts’ showaddelement=’yes’ growth=’vertical’>
<b:widget id=’PostWidget’ locked=’false’ title=’Posts’ type=’Blog’>
<b:includable id=’main’>
<b:loop values=’data:posts’ var=’post’>
</Blogger> </b:loop>
</b:includable>
</b:widget>
</b:section>
<$BlogItemNumber$> <data:post.id/>
<BlogDateHeader> <b:if cond=’data:post.dateHeader’>
</BlogDateHeader> </b:if>
<$BlogDateHeaderDate$> <data:post.dateHeader/>
<$BlogItemPermalinkUrl$> <data:post.url/>
<BlogItemTitle> <b:if cond=’data:post.title’>
</BlogItemTitle> </b:if>
<$BlogItemTitle$> <data:post.title/>
<$BlogItemBody$> <data:post.body/>
<$BlogItemAuthorURL$> <data:blog.homepageUrl/>
<$BlogItemAuthor$> <data:post.author/>
<$BlogItemDateTime$> <data:post.timestamp/>
<BlogItemCommentsEnabled> <b:if cond=’data:post.allowComments’>
</BlogItemCommentsEnabled> </b:if>
<$BlogItemCommentCount$> <data:post.numComments/>
<$BlogItemControl$> <span class=’control’>
<b:if cond=’data:post.emailPostUrl’>
<span class=’item-action’>
<a expr:href=’data:post.emailPostUrl’ title=’Email Post’>
<span class=’email-post-icon’> </span>
</a>
</span>
</b:if>
<b:include data=’post’ name=’postQuickEdit’/>
</span>
<$BlogEncoding$> <data:blog.encoding/>
<$BlogTitle$> <data:blog.title/>
<$BlogItemAuthorNickname$> <data:post.author/>
<$BlogID$>  
<$BlogItemUrl$> <data:post.link/>
<ItemPage> <b:if cond=’data:blog.pageType == "item"’>
</ItemPage> </b:if>
<MainOrArchivePage> <b:if cond=’data:blog.pageType != "item"’>
</MainOrArchivePage> </b:if>
<MainPage> <b:if cond=’data:blog.pageType == "main"’>
</MainPage> </b:if>
<ArchivePage> <b:if cond=’data:blog.pageType == "archive"’>
</ArchivePage> </b:if>
<$BlogItemCreate$> <a expr:href=’data:post.addCommentUrl’>Post a Comment</a>
<BlogItemComments> <b:loop values=’data:post.comments’ var=’comment’>
</BlogItemComments> </b:loop>
<$BlogCommentNumber$> <data:comment.id/>
<$BlogCommentDateTime$> <data:comment.timestamp/>
<$BlogCommentAuthor$> <address style="display:inline;font-style:normal;" class="author vcard">
<b:if cond=’data:comment.authorUrl != ""’>
<a class="url fn" expr:href=’data:comment.authorUrl’><data:comment.author/></a>
<b:else/>
<span class="fn"><data:comment.author/></span>
</b:if>
</address>
<$BlogCommentBody$> <data:comment.body/>
<$BlogCommentDeleteIcon$> <b:include data=’comment’ name=’commentDeleteIcon’/>
<$BlogCommentPermalinkURL$> #c<data:comment.id/>

Of interest are the following variables:

  1. <$BlogURL$>  - homepage URL, aka <data:blog.homepageUrl/>
  2. <$BlogItemPermalinkUrl$> – regular URL aka <data:post.url/>

We’ll have to subtract the first from the second and then append the remainder to new.doma.in.
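
As a rough sketch of that idea in JavaScript, assuming both Blogger tags have already been expanded into plain strings. The function name moveToNewRoot and the sample values are made up for illustration and are not part of the forwarding script:

```javascript
// Hypothetical helper: "subtract" the old homepage URL from a post URL
// and append what is left to the new root.
function moveToNewRoot(postUrl, oldRoot, newRoot) {
  // Only subtract when the post URL really starts with the old root.
  if (postUrl.indexOf(oldRoot) !== 0) {
    return newRoot;
  }
  var rest = postUrl.substring(oldRoot.length); // e.g. "2012/06/somepage.html"
  // Make sure exactly one slash separates the root from the rest.
  return newRoot.replace(/\/+$/, "") + "/" + rest.replace(/^\/+/, "");
}

// Sample values standing in for <$BlogItemPermalinkUrl$> and <$BlogURL$>.
var newUrl = moveToNewRoot(
  "http://old.doma.in/2012/06/somepage.html",
  "http://old.doma.in/",
  "http://new.doma.in/"
);
console.log(newUrl);
```

If the post URL does not start with the old root, this sketch simply falls back to the new homepage rather than guessing.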

javascript

Luckily, this has long been Q&A’d on several forums. These are the solutions that caught my eye.

Removing just the last character (Shi Chuan, highhub)

var str = 'hello world!';
var newStr = str.substring(0, str.length - 1); // drop the last character
alert(newStr); // "hello world"

As above, simpler (peter):

var newStr = str.slice(0, -1)

Get last character (Amir 1):

var sPath = window.location.href;
sPath.slice(sPath.length-1, sPath.length)

Amir 2:

var sPath = window.location.href;
sPath.substr(sPath.length-1,1)

Siva:

var str = "Somestring";
var lastStrChar = str[str.length - 1];

Ammar, remove first and last:

var str = 'hello world!';
var newStr = str.substring(0, str.length - 1); // drop the last character
newStr = newStr.substring(1);                  // drop the first character
alert(newStr); // "ello world"

By now, one should have enough info to write the code even without knowledge of JavaScript. Yet, there are even more specific examples.

In this example, someone is asking for the code for string subtraction using either Java or JavaScript.

If you are using Java or JavaScript, is there a good way to do something like a String subtraction so that given two strings:

org.company.project.component
org.company.project.component.sub_component

you just get

sub_component

I know that I could just write code to walk the string comparing characters, but I was hoping there was a way to do it in really compact way.


Daniel:

public static String sub(String a, String b) {
    // a is a prefix of b: return what follows it
    if (b.startsWith(a)) {
        return b.substring(a.length());
    }

    // a is a suffix of b: return what precedes it
    if (b.endsWith(a)) {
        return b.substring(0, b.length() - a.length());
    }

    return "";
}
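
Since the question also mentions JavaScript, here is what a port of the same idea might look like. This is my own sketch, not something posted in the thread. Note that, like the Java version, it keeps the leading dot of the remainder:

```javascript
// JavaScript port of the Java idea above (a sketch, not from the thread).
function sub(a, b) {
  // a is a prefix of b: return what follows it.
  if (b.indexOf(a) === 0) {
    return b.substring(a.length);
  }
  // a is a suffix of b: return what precedes it.
  var tail = b.length - a.length;
  if (tail >= 0 && b.lastIndexOf(a) === tail) {
    return b.substring(0, tail);
  }
  return "";
}

var rest = sub("org.company.project.component",
               "org.company.project.component.sub_component");
console.log(rest); // ".sub_component" (the leading dot is kept)
```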

Chris:


String result = "org.company.project.component.sub_component".replace("org.company.project.component", "");

roenving:


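<script type="text/javascript">
var a = "org.company.project.component.diff";
var b = "org.company.project.component.sub_component";
var i = 0;
// Walk the common prefix; the length check stops the loop when a and b are identical.
while(i < a.length && a.charAt(i) == b.charAt(i)){
  i++;
}
alert(b.substring(i)); // "sub_component"
</script>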

C# is similar to JavaScript, but different (ch9).

This next example seems to be asking for arbitrary string subtraction.

String 1: "Hello. I love C#"
String 2: "love"
Result: "Hello. I   C#"


Sven Groot:


function string_subtract(str1, str2)
{
  var pos = str1.indexOf(str2);
  if( pos == -1 )
    return str1;
  var result = str1.substr(0, pos) + str1.substr(pos + str2.length);
  return result;
}
alert(string_subtract("Hello. I love C#", "love"));


So, what is the final template?

realdeal


Suppose the URL is in the variable str. Then the solution is:



var str = "http://old.doma.in/somepage.html";

document.write(str.replace(/^(?:\/\/|[^\/]+)*\//, ""));
In: http://old.doma.in/somepage.html -> Out: somepage.html (the leading slash is removed along with the domain)
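
Current browsers and Node.js also ship a standard URL constructor that parses the address without a regular expression. A minimal sketch, with variable names chosen here for illustration (this API was not widely available when the answers above were written):

```javascript
// Parse the address with the standard URL constructor instead of a regex.
var oldUrl = new URL("http://old.doma.in/2012/06/somepage.html");

// pathname keeps its leading slash, so it can be appended to a bare host.
var path = oldUrl.pathname;
var target = "http://new.doma.in" + path;

console.log(path);
console.log(target);
```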

It is also possible to use a longer function when more advanced URL processing is needed:


function getUrlParts(url) {
    var a = document.createElement('a');
    a.href = url;

    return {
        href: a.href,
        host: a.host,
        hostname: a.hostname,
        port: a.port,
        pathname: a.pathname,
        protocol: a.protocol,
        hash: a.hash,
        search: a.search
    };
}

To access the pathname, you would use getUrlParts(yourUrl).pathname.

When redirecting, window.location.replace(...) will best simulate an HTTP redirect.

It is better than using window.location.href =, because replace() does not put the originating page in the session history, meaning the user won't get stuck in a never-ending back-button fiasco. If you want to simulate someone clicking on a link, use location.href. If you want to simulate an HTTP redirect, use location.replace.

For example:

// similar behavior as an HTTP redirect
window.location.replace("http://stackoverflow.com");

// similar behavior as clicking on a link
window.location.href = "http://stackoverflow.com";

Alrighty then, on to subtracting the strings.


subtracting


Turns out that the “subtracting approach” is difficult if not impossible (I could not figure it out). It is simpler and more straightforward to use the original approach, adapted to a Blogger-to-Blogger transfer.


The downside to focusing on JavaScript is that it leaves browsers that lack it, or have it turned off, in the dark. This is actually the reason for that ugly “/blogger/?q=old.url” hack. A 301 permanent redirect can still be issued from a server running PHP, with some small modifications to the original template. Here’s the PHP script (downloadable below):


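<?php
/*
Template Name: blogger
*/

global $wpdb;
$old_url = isset($_GET['q']) ? $_GET['q'] : "";

if ($old_url != "") {
    // Everything after the old domain is the Blogger permalink path.
    $permalink = explode("blogspot.com", $old_url);
    $path = isset($permalink[1]) ? $permalink[1] : "";

    // prepare() escapes the user-supplied path before it reaches the query.
    $q = $wpdb->prepare(
        "SELECT guid FROM $wpdb->posts LEFT JOIN $wpdb->postmeta ".
        "ON ($wpdb->posts.ID = $wpdb->postmeta.post_id) WHERE ".
        "$wpdb->postmeta.meta_key='blogger_permalink' AND ".
        "$wpdb->postmeta.meta_value=%s",
        $path
    );

    // Fall back to the home page when no matching post is found.
    $guid = $wpdb->get_var($q);
    $new_url = $guid ? $guid : "/";

    header("HTTP/1.1 301 Moved Permanently");
    header("Location: $new_url");
    exit;
}
?>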

Since we’ll be using this script for existing domains, we need to replace the “blogspot.com” string delimiter above with the actual domain or its ending, such as “old.doma.in” or “doma.in” or even “.in”. What remains to be figured out is whether the template above, normally used within a WordPress installation as a “static page”, makes any calls or passes any variables to WordPress, or whether it can be used directly with an external domain on Blogger.
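
The choice of delimiter matters more than it may seem. The sketch below mimics PHP's explode() with JavaScript's split() (a stand-in for illustration, with a made-up post URL and a hypothetical helper named pathAfter) to show how a short delimiter such as “.in” can cut the path short when the same characters show up again later in the URL:

```javascript
// split() with no limit behaves like PHP's explode() with no limit:
// the string is cut at every occurrence of the delimiter.
function pathAfter(delimiter, url) {
  var parts = url.split(delimiter);
  return parts.length > 1 ? parts[1] : "";
}

var url = "http://old.doma.in/2012/06/plug.ins.html"; // hypothetical post URL

var full = pathAfter("old.doma.in", url); // whole path survives
var cut = pathAfter(".in", url);          // path is cut at the second ".in"

console.log(full);
console.log(cut);
```

Using the full host name as the delimiter makes this kind of accidental match far less likely.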


Removing the WordPress code, the script becomes:



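<?php
// Placeholder values from the original; replace them with the real new and old domains.
$newdo = 'lzc';
$olddo = 'ozc';

$old_url = isset($_GET['q']) ? $_GET['q'] : '';

// Everything after the old domain is the path to reuse on the new one.
$permalink = explode($olddo, $old_url, 2);
$path = isset($permalink[1]) ? $permalink[1] : '/';

$new_addr = "http://" . $newdo . $path;

header("HTTP/1.1 301 Moved Permanently");
header("Location: $new_addr");
exit;
?>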


Most of the article is unnecessary, as it only documents my path, but some of the stuff above may be useful later :)


Sources / More info: singpolyma, ecma, qsm, ch9, stackoverflow, hhb, so-relative, so-redirect, blogger.txt | template: tod, gh, sm