#mongodb logs for Saturday the 15th of September, 2012

[03:12:32] <bendman> does anyone have experience with mongodb on suse?
[03:57:10] <doubletap> why does my version of mongo cli not allow me to connect to remote databases?
[04:00:14] <doubletap> It is very odd. my instance of mongo cli does not have any options for username or password. Why would that be?
[04:00:51] <doubletap> what am I missing?
[05:44:40] <bennukem> hi
[05:45:31] <bennukem> I would like to restrict a user to only his own database. How do I do that?
[05:45:46] <bennukem> --auth, ok, addUser admin, ok
[05:46:05] <bennukem> use newdatabase ok , addUser finaluser
[05:46:24] <bennukem> but the final user can still create another database. How do I block that?
[06:47:55] <abstrax> how do i select data where created_at = last 7 days ?
[07:16:45] <ron> abstrax: obviously, that depends on the created_at data type.
[07:18:58] <abstrax> ron: if i am using new MongoDate() from PHP, that stores ISODate ?
[07:19:32] <ron> abstrax: I dunno. ask NodeX. or Derick.
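
A minimal sketch of the "last 7 days" filter abstrax asks about, assuming created_at is stored as a real date value (what the shell displays as ISODate) and not as a string. The function name lastDaysFilter and the collection name in the comment are hypothetical. The code only builds the filter document and does not contact a server.

```javascript
// Sketch only: builds the filter document; it does not talk to a server.
// Assumes created_at holds a real date value, not a string.
function lastDaysFilter(days, now = new Date()) {
  const msPerDay = 24 * 60 * 60 * 1000;
  const cutoff = new Date(now.getTime() - days * msPerDay);
  // $gte matches documents whose created_at is at or after the cutoff.
  return { created_at: { $gte: cutoff } };
}

// In the mongo shell this would be passed to find(), for example
//   db.messages.find(lastDaysFilter(7))
// where "messages" is a hypothetical collection name.
const filter = lastDaysFilter(7, new Date("2012-09-15T00:00:00Z"));
console.log(filter.created_at.$gte.toISOString());
// → 2012-09-08T00:00:00.000Z
```

If created_at were stored as a string or a number instead, the cutoff would have to be converted to that same representation, which is the point ron makes about the data type.
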
[07:19:38] <abstrax> btw, i am really confused regarding one thing atm. I am building a chat system, now what if the user changes his name from profile later ? when he views the log, it'll not contain the new name. i dont think i can do JOINs in mongodb.
[07:22:48] <ron> you can't, but user ids should normally be unique and fixed. changing the name, shouldn't change that.
[07:23:51] <abstrax> but then, the only way i can think of is to query once for every username from the application
[07:24:11] <abstrax> or is there something better ?
[07:26:14] <ron> not sure I follow, sorry.
[07:26:51] <abstrax> ron: if i store user_id against each message, and have to show the name in the logs, how would i do it ?
[07:27:05] <abstrax> fetch all messages and query once for each user_id to get user full name ?
[07:27:35] <ron> abstrax: store both.
[07:27:44] <abstrax> yes, but what if user changes his name later ?
[07:27:47] <abstrax> some people do that :P
[07:28:06] <ron> yes, store the user name and the user id for each message.
[07:28:40] <abstrax> and when the user name is changed, how will i show the updated name in logs ?
[07:29:18] <ron> define logs. you assume I know everything about your application and use cases.
[07:29:29] <abstrax> chat log, its like old messages.
[07:29:51] <abstrax> like if i chat with someone today, then when i come back after 5 days, i can see the old messages, aka, logs
[07:30:09] <ron> yes, but you'd see the old user name, not the new one.
[07:30:30] <abstrax> exactly, how can i show the new one ?
[07:31:43] <ron> if it's that important to you, you can just update all entries if the user changes its name.
[07:31:55] <ron> however, that's counterproductive in my opinion.
[07:32:03] <ron> since, let's say here, I type:
[07:32:05] <ron> abstrax: hey man!
[07:32:21] <ron> and then you change the name to aoedmaodmaodmi
[07:32:41] <ron> it's not like the log of my message would change to "aoedmaodmaodmi: hey man!"
[07:32:53] <ron> and that would make reading logs more difficult. just my 2 cents.
[07:33:32] <abstrax> hmm, you are right.
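
For readers who do want current names in old logs, here is a rough sketch of the alternative abstrax describes, done with one batched lookup instead of one query per user_id. Plain arrays stand in for the collections; the field name name_at_send and all helper names are hypothetical and do not come from the log.

```javascript
// Hypothetical data: plain arrays stand in for "users" and "messages" collections.
const users = [
  { _id: 1, name: "abstrax" },
  { _id: 2, name: "ron" },
];
const messages = [
  { user_id: 2, name_at_send: "ron", text: "abstrax: hey man!" },
  { user_id: 1, name_at_send: "abstrax", text: "hmm, you are right." },
  { user_id: 2, name_at_send: "ron", text: "just my 2 cents." },
];

// Collect each distinct user_id once, so one lookup covers the whole page.
function distinctUserIds(msgs) {
  return [...new Set(msgs.map((m) => m.user_id))];
}

// Stand-in for a single query such as
//   db.users.find({ _id: { $in: ids } })
function findUsersByIds(ids) {
  return users.filter((u) => ids.includes(u._id));
}

function renderLog(msgs) {
  const ids = distinctUserIds(msgs);
  const nameById = new Map(findUsersByIds(ids).map((u) => [u._id, u.name]));
  // Fall back to the name stored with the message if the user no longer exists.
  return msgs.map((m) => {
    const name = nameById.has(m.user_id) ? nameById.get(m.user_id) : m.name_at_send;
    return "<" + name + "> " + m.text;
  });
}

users[0].name = "aoedmaodmaodmi"; // abstrax renames himself
console.log(renderLog(messages));
```

Storing both the id and the name, as ron suggests, keeps either option open: the stored name shows what was displayed at the time, and the id allows a lookup of the current name.
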
[08:53:31] <Gargoyle> Anyone wanna help with some indexes?
[08:53:53] <Gargoyle> Collection has these:- http://pastie.org/4724377
[08:54:21] <Gargoyle> Query with explain:- http://pastie.org/4724380
[08:57:42] <Gargoyle> The query doesn't seem to want to use the index, even though a suitable one exists.
[09:03:19] <Gargoyle> ahh. we can't do case insensitive searches!
[09:03:34] <Gargoyle> and use an index!
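
One common workaround, which is not mentioned in the log, is to store a lowercased copy of the searched field and run an exact match against that copy, so that an ordinary index can serve the query. The sketch below only shows the document and filter shapes; the field name "name" and the "_lower" suffix are hypothetical, since the pasted query is not part of this log.

```javascript
// Sketch only: "name" and the "_lower" suffix are hypothetical choices.
function withLowercaseKey(doc, field) {
  // Copy the document and add a normalized shadow field that can be indexed.
  return { ...doc, [field + "_lower"]: String(doc[field]).toLowerCase() };
}

function caseInsensitiveEquals(field, value) {
  // Exact match on the shadow field instead of a case-insensitive regex.
  return { [field + "_lower"]: String(value).toLowerCase() };
}

const stored = withLowercaseKey({ name: "Gargoyle" }, "name");
const query = caseInsensitiveEquals("name", "GARGOYLE");
console.log(stored, query);
```

The trade-off is an extra field to keep in sync on every write, in exchange for an equality match that an index on the shadow field can answer.
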
[09:49:08] <Gargoyle> Or thoughts on why this one might be taking so long:- http://pastie.org/4724507 ?
[11:57:32] <Gargoyle> if I am searching and sorting, does it help if the sort column is in the same index as the search columns?
[11:59:59] <darklrd> Is there any good resource on mongodb indexing? as in how to effectively create indexes?
[12:17:34] <darklrd> if I have multiple collections in my database based on time (say per month)
[12:17:52] <darklrd> then would mongodb store indexes of all such collections in RAM
[12:18:12] <ron> I imagine it does so based on access patterns, but it will definitely try.
[12:18:21] <darklrd> or would it store only the ones which are highly accessed?
[13:03:28] <Gargoyle> I need to drop and recreate a bunch of indexes. I have written the PHP code, but halfway through, the script dies with cursor timeout.
[13:04:03] <Gargoyle> apart from some manually placed sleep() calls, is there a better way of waiting for one build to complete before moving onto the next?
[13:14:14] <darklrd> vsmatck, ron, are you there?
[13:14:23] <ron> I am. Are you excited?
[13:14:36] <darklrd> I had a quick question
[13:15:27] <darklrd> ron, if I am storing chat msgs in mongodb, and if I want to avoid sharding as long as possible
[13:15:43] <darklrd> then would it make sense if I make collections per site per month
[13:16:26] <darklrd> because users usually see recent messages so eventually older msgs will be used less
[13:16:31] <ron> not something I'd do, but it could be a proper solution. it really depends on what kind of queries you have and so on.
[13:16:57] <darklrd> as I understand, sharding is needed when your index mapping no longer can be loaded in RAM
[13:16:58] <ron> it will most likely make the business logic of your application more complicated.
[13:17:13] <ron> not the only reason to shard, but sure.
[13:18:18] <darklrd> so if I reduce the index space by dividing in chat msgs in collections (per month), then would it help? would mongodb wisely load index mapping corresponding to recent msgs only in RAM?
[13:18:32] <darklrd> because its implementation is very complicated
[13:19:15] <ron> I have no idea, sorry.
[13:19:32] <darklrd> my query only includes to retrieve most recent chat msgs for different users corresponding to a particular site
[13:19:37] <darklrd> ok, ron, thanks for your time!
[13:20:11] <darklrd> any idea who would be able to advise me on this?
[13:26:23] <dcrosta> darklrd MongoDB will tend to only have in memory those things that it needs (i.e. that it has used recently or frequently). if the "older" part of your collection or index is not queried often, it won't be in RAM (assuming your index includes a timestamp-like field at the start)
[13:27:28] <darklrd> dcrosta, thank you so much for replying, so even if I dump all chat msgs in a single collection for a particular site it won't matter then?
[13:27:56] <dcrosta> well there are other advantages to using multiple collections -- you can "age out" old documents with a drop(), for instance
[13:28:29] <ron> that assumes you want or need to drop old data.
[13:28:42] <dcrosta> right.
[13:28:52] <ron> that smells very RDBMS-y though ;)
[13:29:00] <darklrd> dcrosta, i plan to use a combined index - (from_user, to_user, timestamp).. u said to use timestamp in beginning, but in my case usually i would need to select recent msgs based on users having conversation
[13:29:34] <dcrosta> darklrd what is "recent"? are you always querying including the time (or a time range with $gt, for instance)?
[13:30:31] <darklrd> dcrosta, recent means last 50 messages, and then more "paging", so i was planning to use time index in descending order
[13:32:30] <dcrosta> i see. in that case, an index on (from_user, to_user, timestamp) seems more appropriate. you can imagine how that index would look as a tree -- all the documents from a given user will be roughly adjacent in the index, sub-sorted by all documents from a given user to another given user, and finally sorted by timestamp. what this means, as far as memory, is that if you are often only considering the first 50 documents for any (user, user) sorted by timestamp, you will only have a smattering of the memory pages that make up the index loaded, even though they are not contiguous in the data files
[13:32:41] <dcrosta> (i hope that makes sense -- it's an easier concept to draw than to explain)
[13:33:35] <darklrd> dcrosta, so mongodb would follow this approach in the case of combined indexing as well?
[13:33:40] <darklrd> dcrosta, I see
[13:39:54] <darklrd> dcrosta, thanks a lot for providing your opinion
[13:40:06] <dcrosta> happy to help
[13:40:38] <darklrd> i have another quick question :)
[13:42:50] <darklrd> I have read that mongodb can easily handle billions of documents (but I don't have any experience in this regard). The only problem seems to be the RAM needed for index loading, as disk space is very cheap compared to RAM.
[13:44:47] <darklrd> Has anyone come across any presentation/video discussing this?
[13:48:06] <dcrosta> RAM is a limiting factor in regards to the "working set" (this is what I was talking about with frequent/recent documents). obviously more is better, but if you are only ever accessing the last days' worth of documents, it might not matter that you have a year's history, since they won't ever be loaded in RAM
[13:48:20] <dcrosta> let me see if i can find a presentation about how this all works, it's good to understand the internals
[13:48:46] <darklrd> I see, thank you so much!
[13:53:03] <dcrosta> http://www.10gen.com/presentations/MongoNYC-2012/storage-engine-internals might go into greater depth than you care to learn, but it will cover the topic well
[13:53:09] <darklrd> so, in this regard, corresponding to RAM, if I load last 50 msgs using combined index (from_user, to_user, timestamp), then it won't load all msgs from_user, followed by all the messages b/w from_user and to_user, in RAM? It will use the timestamp as well, am I correct?
[13:53:50] <darklrd> (in terms of "working set")
[13:53:59] <darklrd> Thank you so much!
[13:54:56] <dcrosta> so it depends on your query -- but assuming you do something like db.messages.find({from: "user1", to: "user2"}).sort({timestamp: -1}).limit(50), then that should be able to use the index efficiently, and only load into memory those documents it needs to return
[13:55:52] <dcrosta> and it will only have to load into memory the index pages corresponding to that portion of the index that you actually use. even if those two users have 10,000 other messages, it won't have to consider them
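
A toy model of what dcrosta describes, assuming a sorted array can stand in for the B-tree index on (from, to, timestamp). It is not MongoDB's implementation; it only illustrates why sort({timestamp: -1}).limit(50) on a fixed (from, to) pair touches about 50 index entries regardless of how many other messages exist. All function names are hypothetical.

```javascript
// Compare two index keys [from, to, timestamp] field by field.
function compareKeys(a, b) {
  for (let i = 0; i < a.length; i++) {
    if (a[i] < b[i]) return -1;
    if (a[i] > b[i]) return 1;
  }
  return 0;
}

// Build a sorted array that stands in for the index.
function buildIndex(docs) {
  return docs
    .map((d) => ({ key: [d.from, d.to, d.timestamp], doc: d }))
    .sort((x, y) => compareKeys(x.key, y.key));
}

// First position whose key is strictly greater than `key` (binary search).
function upperBound(index, key) {
  let lo = 0;
  let hi = index.length;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (compareKeys(index[mid].key, key) <= 0) lo = mid + 1;
    else hi = mid;
  }
  return lo;
}

// Newest `limit` messages for one (from, to) pair, walking the index backwards.
function recentMessages(index, from, to, limit) {
  const found = [];
  let scanned = 0;
  let i = upperBound(index, [from, to, Infinity]) - 1;
  while (i >= 0 && found.length < limit) {
    const entry = index[i];
    if (entry.key[0] !== from || entry.key[1] !== to) break; // left the pair's range
    found.push(entry.doc);
    scanned++;
    i--;
  }
  return { docs: found, scanned: scanned };
}

// 10,000 messages per pair; only the newest 50 of one pair are visited.
const docs = [];
for (let t = 0; t < 10000; t++) {
  docs.push({ from: "user1", to: "user2", timestamp: t });
  docs.push({ from: "user1", to: "user3", timestamp: t });
  docs.push({ from: "user2", to: "user1", timestamp: t });
}
const index = buildIndex(docs);
const result = recentMessages(index, "user1", "user2", 50);
console.log(result.docs.length, result.scanned, result.docs[0].timestamp);
// → 50 50 9999
```

Because the two equality fields come first and the sort field last, the entries for one conversation are adjacent and already ordered by time, so the scan can stop as soon as the limit is reached.
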
[13:56:20] <darklrd> dcrosta, cool, thank you so much!!! :) I will go through the presentation!
[13:56:40] <darklrd> dcrosta, thank you so much for your time and patiently explaining me!
[13:56:49] <dcrosta> happy to help. there's lots of other good presentation content at www.10gen.com/presentations (including some by me -- but don't watch those)
[13:57:04] <darklrd> wow! awesome!
[13:58:24] <darklrd> initially i started with a very bad implementation, I was using collection_user1_user2 per "site" and I quickly hit the number of collection limit, thanks to vsmatck and ron, they helped me out then
[13:59:21] <darklrd> I am learning gradually, thanks for your time and providing video links!
[13:59:23] <dcrosta> in general, unless you know you have a specific use case that benefits from multiple databases or multiple collections, i tend to favor (and recommend) using a simpler schema
[13:59:34] <darklrd> I see
[14:00:00] <dcrosta> it simplifies your application considerably.
[14:00:28] <dcrosta> and usually performs about as well, with proper indexes
[14:00:32] <darklrd> Actually, I was apprehensive that there would gradually be so many messages (in billions), so I thought I should divide them among collections and databases. I should have researched more.
[14:00:40] <darklrd> Yes, I agree.
[14:00:44] <dcrosta> obviously i'm generalizing here. there are exceptions. but it's good to start simple
[14:01:19] <darklrd> ok
[14:02:03] <dcrosta> http://www.10gen.com/presentations/mongosv-2011/schema-design-at-scale is probably another good one to watch. it actually covers specific schemas that sound similar to what you're working with
[14:02:41] <darklrd> awesome :D , I will go through this one too, thanks again!
[14:09:29] <Derick> Which of the two of "CAP" do you think MongoDB implements?
[14:11:58] <ron> CP
[14:14:37] <ron> though like many nosql dbs, it can probably be configured otherwise.
[15:00:53] <aboudreault> I see that a common practice is to use the *name* of something as _id rather than a uuid. This is probably *only* good for an object name that can't change, i.e. a username. am I right, or am I missing something about how to rename things globally?
[15:04:14] <noordung> aboudreault, You can use anything with _id. Since there is an implicit index on _id, you may want to reconsider what sort of data you put into _id.
[15:05:35] <aboudreault> noordung, yes, but take as an example that I create a group, "My News Group", which is referenced in the users collection (membership). after 2 weeks, the owner fixes the typo and renames the group to "My New Group"
[15:05:42] <aboudreault> this breaks things, doesn't it?
[15:06:35] <aboudreault> *break* might not be the appropriate term.. but all the embedded references are broken somehow.
[15:06:39] <noordung> aboudreault, yes, I wouldn't advise (is it at all possible?) to change _id values...
[15:07:24] <aboudreault> ah.. I haven't tried to be honest. I'm currently reading... this would answer my question then :)
[15:08:22] <noordung> aboudreault, in any case, _id values are implicitly unique...
[15:09:17] <aboudreault> Ok.
[15:10:39] <dcrosta> _id is immutable, if you need to change a document's _id, you need to create a new document and delete the old
[15:12:48] <aboudreault> dcrosta, thanks, this answers my initial question and makes sense.
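
A small sketch of the "create a new document and delete the old" step dcrosta describes, assuming a Map keyed by _id can stand in for a collection. The helper name changeId and the owner field are hypothetical; a real collection would use an insert followed by a remove.

```javascript
// Sketch only: a Map keyed by _id stands in for a collection.
function changeId(collection, oldId, newId) {
  if (!collection.has(oldId)) throw new Error("no document with _id: " + oldId);
  if (collection.has(newId)) throw new Error("_id already in use: " + newId);
  // Copy every field, overriding only _id.
  const copy = { ...collection.get(oldId), _id: newId };
  collection.set(newId, copy); // insert the new document first
  collection.delete(oldId); // then delete the old one
  return copy;
}

const groups = new Map([
  ["My News Group", { _id: "My News Group", owner: "aboudreault" }],
]);
const renamed = changeId(groups, "My News Group", "My New Group");
console.log(renamed._id, groups.has("My News Group"));
// → My New Group false
```

Any other document that still refers to the old _id (the membership entries in aboudreault's example) would have to be updated separately, which is why a value that can change is a risky choice for _id.
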
[20:07:08] <Zelest> What's a good way to rank/handle FTS in mongodb?
[20:07:16] <Zelest> Like, is there any nice example-code for that somewhere?
[20:27:34] <Venemo> hi guys
[20:28:18] <Venemo> is it possible to store binary data in a field in Mongo?
[20:28:45] <Venemo> I'm looking for something like SQL Server's VARBINARY columns
[20:55:57] <Vile> Hi!
[20:56:21] <Vile> Is it possible to reduce sharding priority?
[20:57:16] <Vile> i.e to make sure that chunks re-allocation happens in background, while queries run at nearly normal speed?
[21:20:11] <addisonj> is anybody here deployed in ec2 across multiple regions? just trying to figure out the best way to manage traffic between the two regions
[23:29:19] <thepumpk_> Hi. I have a replica set with 4 nodes: one arbiter, one primary and two secondaries. After upgrading from 2.0.7 to 2.2 all my nodes are in Secondary state (except the arbiter).
[23:29:21] <thepumpk_> Any ideas?
[23:32:04] <thepumpkin> I tried assigning priority to my former primary but no luck, still stuck in secondary. I have now removed all my other nodes, now there is only the arbiter and the secondary that is supposed to be primary. still not working.
[23:43:13] <thepumpkin> I had forgotten about the arbiter since I don't see it in MMS, upgrading the Arbiter to 2.2 solved the issue.
[23:46:17] <thepumpkin> everything is up now, MMS is reporting everything wrong, I guess I have to update the MMS agent too.