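The Season of Dawn 2

  The night before last, at two in the morning, I typed up "The Season of Dawn", and today my Chinese teacher criticized it, saying that the logic did not hold together and that the language was stiff in places. I reread the post today and agree with my teacher, so I want to rewrite it so that you can understand more clearly why I use this phrase.
  Every season of life brings its own characteristics. If we learn to see clearly which season of life we are in, we will know how to act today. In 2009 I did many things that I regret, and even crossed my own boundaries. In 2010 I felt that I was going round and round in the wilderness, making no progress at all. During those two years I lived in darkness, not knowing in my heart whether that dark period would ever pass. Looking back, I am grateful: that was the darkness before dawn. I made it through that time and have finally started walking toward sunlit days.
  I believe that the present is the season of dawn in my life, and that the next few years will grow brighter and brighter. Over these past few months, I am grateful that God, through my leaders, has brought me to where I am today. After all these years, I feel that I have finally learned to cherish life and to cherish every day. Learning to plan your time at school is one thing, but grasping the value of time is another. Grasping how short and how valuable my life is has changed the mindset with which I get up each morning. I believe this has led to breakthroughs in many areas.
  I will leave it here for tonight and continue writing on this topic another day.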

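The Season of Dawn

Lately my state of mind has been somewhat hard to describe. I am glad to have met many new friends through some websites, but at the same time I understand that making real friends online is not that simple. Because I am rather introverted by nature, reaching out to others tires me.

I have always known that tomorrow is an unknown, but I feel that my attitude toward the future has changed over these past few weeks. Perhaps it changed with the passing of time; perhaps I have grown up, and my view of the world has changed too.

Different seasons bring different characteristics. I believe I am currently living in the season of dawn. Even though my heart is sometimes uneasy and I sometimes worry about tomorrow, I believe I am living in the best season right now. How long this season of dawn will last, and how long the morning, noon, afternoon, and evening that follow will be, I do not know. What I do know is that life is short and precious. Today is the only day that I can choose how to live.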

Effective Language Exchange

Though generally an introvert, I enjoy meeting people from a wide spectrum of society and understand that relationships are an integral part of life on earth. During my time at ECNU I had friends who had language partners, but it mostly ended up being more of a one-way relationship, which caused me to become skeptical of such arrangements. However, I now believe that effective language exchange partnerships are possible if both parties are serious about it and start off with common goals in mind. In today’s post I will present an overall picture of what I do, from finding language partners to meeting up with them.

  1. Finding People
    How you go about this will largely depend on where you are. In this day and age a number of online communities have sprung up to cater to this need. Examples include Lang-8 and Livemocha. I helped a friend on Lang-8 translate her resume, which brought me into contact with business vocabulary that will be useful when I look for work in China in the future. Livemocha looks like it has quite a vibrant community as well; I just joined last night and will share my results in the future. While I have met up in real life with over 50 internet friends over many years, I cannot overstress the importance of exercising caution if you meet up with someone you know only from the internet.

    Many universities and colleges also have their own internet forums, which allow students to post ads.

    Lastly, there are always real world methods such as posting ads in specified places, asking around your friends, etc.

  2. Finding the Right People
    This is where you, as a learner, need to exercise your own discretion to decide who you want to learn from. While I know basic Vietnamese, you probably wouldn’t want to learn it from me. Since this is a partnership, you will also want to ensure that you have a sufficient level of proficiency to help your partner. It also seems to me that language standards have dropped in places that I have been to. I believe this can be attributed to the increased use of typing over writing, among other factors. This is important to note once you get past the intermediate level.

    Another point to note is to find someone who is equally committed. However, given the nature of the internet, there is no straightforward way to tell in advance whether someone is committed. My suggestion is to just give it a shot, make contact, and you will soon find out how committed that person is.

    Common interests are also very helpful in fueling the friendship and conversations, though they are not compulsory.

    Even if you’re not too serious about improving your language, thinking through these steps will help you gain a clearer picture of where you are headed. As an upper-intermediate learner of Chinese, I look for the following points in a language partner. I am not recommending that you use the same criteria.

    • Native mainland Chinese. Though I do love Taiwanese people and their music, my long term goal is to go to China and therefore I do not wish to expend brain juice on using traditional Chinese.
    • Someone who is serious about his own use of language and considers himself a proficient user of Chinese. As I understand it, there is a rising number of people in my generation who do not pay attention to the words they use when typing in pinyin, resulting in text that is still understandable but not strictly correct. That is bad for me as a language student, especially if it is material that is new to me. That person should likewise be serious about learning English.
    • Lastly, I am looking for someone with a heart. This is not a business partnership. I have an interest in technology (particularly IT), personal growth, finance, music, photography, belief systems, community work, and all things China. If someone has interests that fall outside this scope, I am happy to read up on something that I’m not familiar with so that we can use it as material for discussion.

  3. Meeting Up
    While meeting up is not compulsory, I feel that I learn much better when I am with my language partner. I will repeat myself: exercise caution when meeting someone for the first time. Meet in a crowded place, don’t carry too much cash with you, and be vigilant.

    I usually try to bring some material along, whether it be some books, magazines or articles of interest. In fact it could just be some mentally prepared discussion topics, with a list of associated vocabulary. I strongly suggest this because I have seen too many people repeating the same conversations about the weather or just staring at one another. It also provides a structure that makes the exchange more effective and meaningful.

    The other thing I bring is a notepad to jot down new words, phrases, and other interesting (or boring but important) bits of Chinese that I learn. This also aids in later revision and becomes material that your language partner can test you on again at the next session if need be.

    Lastly, bring an expectant attitude with you and come prepared to help your language partner. Remember, perfecting your craft is just one part of life. Enjoy the meeting and learn as much as you can. It is very likely that your language partner will have much more to share with you than just his/her language. If you are able to meet me, I will be happy to talk about cutting-edge technology that I read about on scientific blogs, simple things I do to make life easier, how the internet works, or even my views on a whole range of topics.

While I started writing this post for my own memory, I hope that you will find it helpful in your quest to learn a foreign language effectively. If you feel that this article has helped you or if you feel that there are other things to take note of, please feel free to post in the comments or to contact me!

Automatic Capitalization

I started off the afternoon familiarizing myself with jQuery in preparation for coding Textsmith. After that, I started to ponder what rules should be used to clean up text, starting with English. I thought I was smart to steer clear of the more complicated regions of language that I get myself tangled in at times, and to start with something seemingly simple: automatic capitalization of text.

Little did I realize that that in itself is no mean feat. Once we get past the easily programmable rules such as:

  • The word I
  • The first letter of each sentence

things get pretty messy. We run into proper nouns, which are a nightmare. If we delve into this realm, is it possible to keep Textsmith lightweight and purely JavaScript-based? Let’s assume not, for starters. My first thought was to dip into Wikipedia and perform text analysis on that. This is what I thought of:

  1. For each article, take the title of the article
  2. If it is a single word, look for all instances of the word in that article where it is not used at the beginning of a sentence. If it is capitalized, it is a proper noun.
  3. If it is a phrase, it is a proper noun if the phrase is used in the exact same caps throughout the article. An example of a multi-word proper noun on Wikipedia is New York University while a non-proper noun would be data mining.
  4. We may possibly analyze related articles, but I didn’t think about what method of identifying related documents to use.
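
Steps 1 to 3 above could be sketched roughly as follows. This is only a minimal illustration under a few assumptions of mine: the function name `looksLikeProperNoun` is hypothetical, the title is passed in with the same capitalization Wikipedia gives it, the article body is already available as plain text, and a sentence start is detected naively by looking for `.`, `!` or `?` before the match.

```javascript
// Hypothetical helper: guesses whether an article title is a proper noun,
// based only on how the title is written inside the article body.
// Returns true, false, or null when the body gives no usable evidence.
function looksLikeProperNoun(title, body) {
  // Escape regex metacharacters so the title is matched literally.
  const escaped = title.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
  const pattern = new RegExp('\\b' + escaped + '\\b', 'gi');
  const isPhrase = /\s/.test(title.trim());

  let seen = 0;
  let capitalized = 0;
  let exact = 0;
  for (const match of body.matchAll(pattern)) {
    // Naive sentence-start check: nothing before the match, or the
    // preceding text ends with sentence punctuation.
    const before = body.slice(0, match.index).trimEnd();
    const startsSentence = before === '' || /[.!?]$/.test(before);
    // Step 2: for single words, ignore occurrences that start a sentence.
    if (!isPhrase && startsSentence) continue;

    seen += 1;
    if (match[0] === title) exact += 1;
    if (/^[A-Z]/.test(match[0])) capitalized += 1;
  }

  if (seen === 0) return null;
  // Step 3: phrases must keep the exact same caps everywhere.
  return isPhrase ? exact === seen : capitalized === seen;
}
```

With this sketch, a body that mixes "Data mining" at the start of a sentence with "data mining" elsewhere is reported as not a proper noun, while a body that always writes "New York University" is reported as one. Words like jQuery would still be misjudged by the single-word branch, since their first letter is lowercase.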

However there are still problems to be addressed:

  • What about odd words like jQuery?
  • While Wikipedia would provide fairly good coverage, it is definitely not an exhaustive representation of the text available online. The algorithm above would not be able to parse unstructured text either.

Assuming we establish a sufficiently comprehensive database of proper nouns, including names of people, products, etc., the next question is how can we efficiently and correctly identify proper nouns in a body of text? We start to run into problems such as:

  • Context awareness. The word dell is both a common noun and a proper noun, depending on the context. There is some nice (unpublished?) work done by Google as demonstrated on this Google Wave video.
  • There are approximately 3,892,495 articles on Wikipedia as of this writing. How many proper nouns would there be in here, and how many more are we missing?
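
To make the lookup question a little more concrete, here is a minimal sketch of what a dictionary-based pass might look like. Everything in it is an assumption for illustration: the `properNouns` table, its three entries, the `ambiguous` flag, and the function name `capitalizeKnownNouns` are all hypothetical, and it only handles single words. Since it has no context awareness, it simply leaves ambiguous words such as dell untouched.

```javascript
// Hypothetical proper-noun table: lowercase key -> canonical spelling.
// `ambiguous: true` marks words that are also common nouns.
const properNouns = new Map([
  ['wikipedia', { canonical: 'Wikipedia', ambiguous: false }],
  ['jquery', { canonical: 'jQuery', ambiguous: false }],
  ['dell', { canonical: 'Dell', ambiguous: true }],
]);

// Replaces each known, unambiguous word with its canonical spelling.
function capitalizeKnownNouns(text, table) {
  return text.replace(/[A-Za-z]+/g, (word) => {
    const entry = table.get(word.toLowerCase());
    // Unknown or ambiguous words are returned exactly as written.
    if (!entry || entry.ambiguous) return word;
    return entry.canonical;
  });
}
```

A `Map` keeps each lookup cheap, so the cost grows with the length of the text rather than the size of the table. Multi-word names and the context problem are left open here.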

Thankfully, there is a project called OpenCalais that works on semantically tagging text. It is free to use via an API but not completely open, which is a bit of a bummer from a philosophical point of view.

I didn’t realize that just thinking about capitalization could make my brain hurt so much. There was some other nifty stuff that I read up on the topics of text analysis and how people are deriving information from unstructured text. There’s this interesting project by the AP called Overview that looks like it’s doing a pretty decent job of giving the user a high-level overview of a large corpus (hundreds of thousands) of documents. It is actually tempting for me to deviate from the original goal of Textsmith.

So where do we go from here?

There is an overwhelming number of hard problems in the world. I’m not saying that to put undue pressure on myself, or to use it as an excuse to give up. I think problems matter differently to different people, and that is a contributing factor to the diversity that we get to enjoy. If everyone were an electrical engineer, or everyone were a writer, the world would be a lot poorer.

For myself, I’m probably going to stick to the lightweight JavaScript-powered version of Textsmith, and any gnarly science will go into another project. What are your thoughts on what language cleaning/processing Textsmith should do?

Textsmith

A few months back, I was tasked with preparing song sheets for my church’s small group meetings. An easy task indeed, till I realized that most of the lyrics I had came in full caps, with erroneous whitespace and some other oddities. Cleaning this up was no small feat, and I wouldn’t be surprised if plenty of other people had spent countless hours cleaning up text. This led to the birth of lyricfixer, a JavaScript-based web page built in 3 hours that fixes text in a programmatic manner. While it doesn’t cover all bases, it follows the Pareto principle of solving 80% of the problem with 20% of the effort.
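
To give an idea of the kind of cleanup involved, here is a minimal sketch. It is not the actual lyricfixer code: the function name `cleanLyrics` and the exact rules are my assumptions for illustration. It collapses stray whitespace, lowercases lines that are entirely in capitals, and then capitalizes the first letter of each line and the word I.

```javascript
// Hypothetical cleanup pass over a block of lyrics, one line at a time.
function cleanLyrics(text) {
  return text
    .split('\n')
    .map((line) => {
      // Collapse runs of spaces and tabs, and trim both ends of the line.
      let cleaned = line.replace(/[ \t]+/g, ' ').trim();
      // Treat a line as "full caps" if it has letters but no lowercase ones.
      if (/[A-Z]/.test(cleaned) && !/[a-z]/.test(cleaned)) {
        cleaned = cleaned.toLowerCase();
      }
      // Capitalize the first letter of the line and the standalone word "i".
      return cleaned
        .replace(/^[a-z]/, (c) => c.toUpperCase())
        .replace(/\bi\b/g, 'I');
    })
    .join('\n');
}
```

One obvious gap is that lowercasing a full-caps line also lowercases any names in it, so those still need fixing by hand.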

One day, while brainstorming with a friend for a new project we could work on together, I thought it would be a good opportunity to further develop lyricfixer, as I didn’t know of anything similar in the market, or at least not something that is free and accessible. I wanted to keep the ease of use, but to extend it to fix up other forms of broken text, such as missing spaces after periods and commas. Many a time I’ve seen people learning English as their second language get such things wrong, and I felt that such a tool could at least help them make their text visually correct, without looking at their language.
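
A spacing fix of this kind could be sketched as below. The function name `fixPunctuationSpacing` is hypothetical, and removing the space before a period or comma is an extra rule I am assuming here, beyond the missing space after them.

```javascript
// Hypothetical spacing fix for periods and commas.
function fixPunctuationSpacing(text) {
  return text
    // Assumed extra rule: remove spaces that come before a period or comma.
    .replace(/ +([.,])/g, '$1')
    // Add a space when a period or comma is followed directly by a letter.
    .replace(/([.,])(?=[A-Za-z])/g, '$1 ');
}
```

Numbers such as 3.14 are left alone because the second rule only fires before a letter, but abbreviations like e.g. and domain names would be split, so a real version would need exceptions for those.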

And thus the Textsmith project was started on GitHub. Over our next few conversations we had some discussions about what we see Textsmith becoming, and listed some run-of-the-mill text manipulation operations that we hoped Textsmith would be able to help us with (Features). At present Textsmith still has no interface, and while surfing around and looking at related topics of text manipulation, text analysis, data mining, etc., I realized how this could snowball into an unmanageable project.

In line with this I am thinking of defining a qualifying statement of “What you wished your text editor did…”, which means that IP netmask conversions and other such calculators will not go into this. Later on we would ideally be able to support a plugin system, though I’m not sure how that will pan out. By this guiding thought, we may include simple, surface-level analytics, but nothing like sentiment analysis.

Let me know if you have any thoughts on Textsmith either in the comments or via email at lijie@eccentri.cc!

Why Did God Create Us?

I recently had the opportunity to share my faith with a Buddhist friend in Chongqing. While we talked, my friend asked an innocent question: “Why did God create us?” My understanding has been that it was to have fellowship with us, but that doesn’t sound complete in itself. Was he then lonely or incomplete? I promised to get back to my friend on it, and decided to blog about it at the same time.

I believe there are a few main reasons:

1. To glorify Himself. While there is much in existence, nothing glorifies God like man. Being made in his image, we are the pinnacle, the crowning glory of all his creation. Everything in creation glorifies him, but we, as people, are the only creation that actually reflects him.

2. For fellowship with us. Prior to the creation of mankind, there were angels, but still they were not created in God’s image. I am not sure what differences there are between angels and humans in terms of ability to communicate/fellowship though.

3. To demonstrate love. Perhaps as the very essence of love, God had an innate desire to exude love, and mankind was the best channel through which he has exhibited his love.

Do you have other thoughts? Post them in the comments!