So today I was perusing my blog list and ran across a post by Rick Strahl (Dynamic Delegates with Expression Trees), which references a post by Nate Kohari (Fast Late-Bound Invocation with Expression Trees).
At first I just thought it was a very interesting post and thought, “wow, next time I have to do something requiring a ton of reflection method calls I’ll have to remember this”. I took off with the family for the day, and sometime during the day the idea popped into my head to try it for reading property values (since a property read is actually just a method call under the hood). And I do have the perfect place to spike this idea…
One of the things I’ve done to the i4o library is try to reduce the calls to reflection whenever possible. And the only one left that I couldn’t get past was having to use the PropertyInfo’s GetValue reflection call to get at a value for creating the property index. This generally isn’t a big deal because we only have to do it once, however it makes the initialization of the IndexableCollection<T> fairly expensive.
So I decided to spike the idea of using the DelegateFactory from Nate’s blog to read an object’s property values. And below I’m including the project that I spiked to test this idea out.
It compares how fast we can read values out of an object’s property by first using reflection and second using the dynamic delegate.
If you read Rick’s post above, he mentions that the dynamic delegate approach is a little reflection-expensive up front. However, that cost is only paid once, and I ran a couple of tests… (P.S. I noticed a small flaw in my timing, however I don’t want to re-zip & re-upload the spike… so if you see the timing flaw, cool…)
I noticed that after about the first 5000 reads the expense of the heavy up-front reflection from DelegateFactory started to become negligible. And the more reads, the more powerful the new reading strategy became. Take 1/5 million reads for example… normal reflection reading took .78 seconds, while the delegate reading took only .03 seconds, which is a substantial speedup.
After that I spiked it in the i4o project and proved that this could significantly reduce the index build time of an IndexableCollection<T>. I shelved the spike and am having Aaron take a look at the idea. We’ll see; I haven’t come up with any reasons why this could cause any issues, so it may be included…
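The comparison described above can be sketched roughly as follows. This is not the DelegateFactory from Nate's post or the code from the spike; it is a minimal illustration that assumes a hypothetical Student type and a hypothetical CreateGetter helper, and it builds the getter with the standard System.Linq.Expressions API. The delegate is compiled once, so its cost is amortized over the reads.

```csharp
using System;
using System.Diagnostics;
using System.Linq.Expressions;
using System.Reflection;

// Hypothetical sample type, used only for this illustration.
public class Student
{
    public string FirstName { get; set; }
}

public static class PropertyReaderSketch
{
    // Builds instance => (object)((DeclaringType)instance).Property
    // and compiles it into a reusable delegate (the expensive, one-time step).
    public static Func<object, object> CreateGetter(PropertyInfo property)
    {
        ParameterExpression instance = Expression.Parameter(typeof(object), "instance");
        Expression typedInstance = Expression.Convert(instance, property.DeclaringType);
        Expression propertyAccess = Expression.Property(typedInstance, property);
        Expression boxedValue = Expression.Convert(propertyAccess, typeof(object));
        return Expression.Lambda<Func<object, object>>(boxedValue, instance).Compile();
    }

    // Times the same number of reads through reflection and through the delegate.
    // The delegate timing deliberately includes the one-time compile cost.
    public static void RunComparison(int reads)
    {
        var student = new Student { FirstName = "Aaron" };
        PropertyInfo property = typeof(Student).GetProperty("FirstName");

        Stopwatch reflectionTimer = Stopwatch.StartNew();
        for (int i = 0; i < reads; i++)
        {
            property.GetValue(student, null);
        }
        reflectionTimer.Stop();

        Stopwatch delegateTimer = Stopwatch.StartNew();
        Func<object, object> getter = CreateGetter(property);
        for (int i = 0; i < reads; i++)
        {
            getter(student);
        }
        delegateTimer.Stop();

        Console.WriteLine("Reflection: " + reflectionTimer.Elapsed);
        Console.WriteLine("Delegate:   " + delegateTimer.Elapsed);
    }
}
```

Calling PropertyReaderSketch.RunComparison with a small and a large read count should show where the up-front cost stops mattering on a given machine; the exact numbers will vary.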
Thank you for your response.
My Xml looks like
University1
Course1
Student1
Student2
University1
Course2
Student3
Student4
University2
Course1
Student5
etc...
XElement cimXml = XElement.Load(@"C:\Students.xml");
So, here I am grouping on Universities.
var universityGroupedElements = from ele in cimXml.Elements() group ele by ele.Name;
Now I want to add indexes to University, Course, Student elements to make queries faster.
Also, I want to group the courses in each university and also students in each course
Please help, in writing some code.
I would be happy to help you, but have a request and a couple questions...
1. I've realized my blog comments are starting to become a bit more of the i4o knowledge base than it deserves. Would you please ask your question over on the i4o Discussion board?
2. I'd like a little more detail. What does the xml structure look like? What does the linq query look like that returns your "grouped" data? What is it you are trying to index/search?
Thanks,
Jason
University
Course
Student
I have a list of Students from various Courses and various Universities. I used Linq to group the Students by University and Course. Now my question is how to implement indexing on Students, Courses and Universities.
Please give the implementation details.
Thanks
Satya
For example:
Instead of writing this..
var f = indexedFileInfosFromDir.Where(fi => fi.Extension == "txt" && fi.IsReadOnly == true);
Try this..
var f = indexedFileInfosFromDir.Where(fi => fi.Extension == "txt");
var f2 = f.Where(fi => fi.IsReadOnly == true);
(filtering on fi.Extension should yield a smaller group than filtering on fi.IsReadOnly)
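The idea behind the two-step query can be illustrated without i4o at all. The sketch below is an assumption-laden stand-in, not the library's implementation: FileEntry and FindReadOnlyByExtension are hypothetical names, and the "index" is just a standard LINQ ILookup keyed on the extension. The keyed lookup narrows the candidates first, and the second condition is then an ordinary filter over that smaller group.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the file entries in the comment above.
public class FileEntry
{
    public string Extension { get; set; }
    public bool IsReadOnly { get; set; }
}

public static class TwoStepFilterSketch
{
    // Builds a simple equality index on Extension (done once per collection).
    public static ILookup<string, FileEntry> BuildExtensionIndex(IEnumerable<FileEntry> files)
    {
        return files.ToLookup(f => f.Extension);
    }

    public static List<FileEntry> FindReadOnlyByExtension(
        ILookup<string, FileEntry> byExtension, string extension)
    {
        // Step 1: the keyed lookup returns only entries with this extension
        //         (an empty sequence if the key is absent).
        // Step 2: the remaining predicate scans just that group.
        return byExtension[extension].Where(f => f.IsReadOnly).ToList();
    }
}
```

Putting the more selective condition first means the second, unindexed filter has less work to do.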
From what I have tested,
It responds within a few milliseconds.
I am happy with this.
Thank you for great work.
Please keep in mind that this library is very simple, and is not a complete Linq implementation.
It provides some great benefits in the scenarios it was designed for. And I know Aaron has some more improvements on the way.
If you create any patches for the project, we are happy to take a look at any improvements you can come up with.
Thanks again
From the 'Demoi4o' project.
If I change this below query
var studentsNamedAaronFromConstant =
from student in _testStudents
where student.FirstName == studentNameBox.Text
select student;
To
var studentsNamedAaronFromConstant =
from student in _testStudents
where student.FirstName.Contains(studentNameBox.Text)
select student;
It takes the longest time!!
If this is what you were asking, then no it will not update the index.
You need to call IndexableCollection<T>.Add/Remove and not its base type's Collection<T>.Add/Remove for the indexes to be updated.
This is, however, something we'd like to support in the future, which is why I checked in the failing test. We probably just need to implement ICollection<T> etc. to fully support this.
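For readers wondering what keeping an index in sync through the base type could look like, here is one possible approach. It is purely an illustration and not how i4o works or plans to work: IndexedCollectionSketch and Find are hypothetical names, and the sketch relies on the protected virtual InsertItem, RemoveItem, SetItem and ClearItems hooks that System.Collections.ObjectModel.Collection<T> provides. It also assumes keys are never null.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;

// Hypothetical collection that maintains one equality index on a key.
public class IndexedCollectionSketch<T, TKey> : Collection<T>
{
    private readonly Func<T, TKey> _keySelector;
    private readonly Dictionary<TKey, List<T>> _index = new Dictionary<TKey, List<T>>();

    public IndexedCollectionSketch(Func<T, TKey> keySelector)
    {
        _keySelector = keySelector;
    }

    // Returns the items whose key equals the given key (empty list if none).
    public List<T> Find(TKey key)
    {
        List<T> bucket;
        return _index.TryGetValue(key, out bucket) ? new List<T>(bucket) : new List<T>();
    }

    // Collection<T>.Add and Insert both route through InsertItem,
    // so the index is updated even when called via a Collection<T> reference.
    protected override void InsertItem(int index, T item)
    {
        base.InsertItem(index, item);
        AddToIndex(item);
    }

    protected override void RemoveItem(int index)
    {
        T removed = this[index];
        base.RemoveItem(index);
        RemoveFromIndex(removed);
    }

    protected override void SetItem(int index, T item)
    {
        T replaced = this[index];
        base.SetItem(index, item);
        RemoveFromIndex(replaced);
        AddToIndex(item);
    }

    protected override void ClearItems()
    {
        base.ClearItems();
        _index.Clear();
    }

    private void AddToIndex(T item)
    {
        TKey key = _keySelector(item);
        List<T> bucket;
        if (!_index.TryGetValue(key, out bucket))
        {
            bucket = new List<T>();
            _index[key] = bucket;
        }
        bucket.Add(item);
    }

    private void RemoveFromIndex(T item)
    {
        TKey key = _keySelector(item);
        List<T> bucket;
        if (_index.TryGetValue(key, out bucket))
        {
            bucket.Remove(item);
            if (bucket.Count == 0) _index.Remove(key);
        }
    }
}
```

Because the overrides are virtual hooks rather than hidden methods, adding or removing through a Collection<T> reference still reaches them.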
I have a question.
If we have some changes on our base collection like add or delete, do we have to re create index or do some special steps?
Thanks,
Paiwan
I was excited when I saw your post and really like the syntax for adding properties to the IndexSpecification. Now, I guess I'll just wait until I can actually use it like that. From Aaron's blog, it sounds like he plans to introduce updates to the Where expression in the next release.
Thanks for your work on this library. I'm hoping this will make it possible to replace a subsystem of our app which currently relies on looking up factors in large in-memory sets of data by using XML. It loads large XML documents into memory and builds XPath queries to get to the individual factors that it needs. The XPath queries themselves are pretty fast, but I'm trying to use Linq-to-XML to project the XML into collections of strongly typed objects that I can query with Where lambdas instead. It works well, but when the collections are large enough the queries are much slower than the XPath version. I'm hoping the IndexableCollection will make it possible.
Kevin
I was just illustrating the power of the IndexSpecification. Unfortunately the extension methods used to evaluate the expression need some work.
Thanks for this post. I got to it from the i4o home page on Codeplex. I noticed in the discussions that someone mentioned the fact that if you have multiple expressions in your Where expression, that it won't use the index.
So does that mean, your example showing multiple properties being added to the index doesn't really help if you wanted to search your collection on both of the properties in the index?
For example, would this code use the index?
var f = indexedFileInfosFromDir.Where(fi => fi.Extension == "txt" && fi.IsReadOnly == true);
From my testing, it appears that no benefit is achieved from using an IndexableCollection vs. a regular List in this case. Seems like there's no point in using the IndexableCollection if you need more than one property in the index. Am I missing something or is that correct?
Thanks,
Kevin