PMXBOT Log file Viewer


#mongodb logs for Thursday the 23rd of April, 2015

[02:51:13] <fullstack> How would someone go about using Amazon AutoScale with Replica set ?
[03:37:38] <GothAlice> fullstack: Not sure about the specific technology; scaling a replica set horizontally only gives you additional read access for queries already being directed at secondaries. If you need to scale writes, you also need to scale in replica-set-sized multiples and configure multiple sets in a sharded array.
[03:38:06] <fullstack> Thanks
[03:38:12] <fullstack> Is that what compose.io does for $$$?
[03:45:59] <GothAlice> fullstack: joannac was kind enough to remind me of this article: http://askasya.com/post/canreplicashelpscaling
[03:46:50] <GothAlice> fullstack: In general, sharding (with very well-chosen sharding keys) is the way to go in terms of scaling.
[03:47:46] <GothAlice> Or: optimization. Measure things, find out what's taking up a lot of room, and refactor. http://www.devsmash.com/blog/mongodb-ad-hoc-analytics-aggregation-framework demonstrates how structuring your data can affect both query performance and on-disk size, as a dual comparison.
[03:53:46] <GothAlice> fullstack: And yes, that's what compose and other "cloud NoSQL" services do.
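
A quick illustration of what "directing queries at secondaries" looks like from the mongo shell, as a minimal sketch (the collection and filter are placeholders; drivers expose the same read-preference modes):

    db.orders.find({ status: "open" }).readPref("secondaryPreferred")   // per-query
    db.getMongo().setReadPref("secondaryPreferred")                     // for the whole connection
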
[04:43:35] <zhulikas> hey fellas, got a question
[04:44:02] <zhulikas> I've set up query logging and can see my latest query like this:
[04:44:02] <zhulikas> db.system.profile.find({ "query.internal_resource" : "Patient" }).limit(1).sort( { ts: -1 } ).pretty()
[04:44:39] <zhulikas> now this is part of the result I get
[04:44:40] <zhulikas> http://pastebin.com/ZbWCRjfu
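
For context on where db.system.profile comes from: the profiler that populates it has to be switched on first, roughly like this (level 2 records every operation; the 100 ms threshold for level 1 is just illustrative):

    db.setProfilingLevel(2)          // profile every operation on this database
    db.setProfilingLevel(1, 100)     // or: only operations slower than 100 ms
    db.getProfilingStatus()          // check the current setting
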
[04:45:01] <zhulikas> I do indeed query with "given" = "Vardenis" from within my application
[04:45:13] <zhulikas> but the actual query applied is name.given
[04:45:21] <zhulikas> so is this some sort of mongodb collection index?
[04:45:36] <zhulikas> (i'm only using open source application and trying to extend some parts of it)
[04:48:04] <joannac> zhulikas: pastebin the whole document
[04:49:09] <zhulikas> http://pastebin.com/nU1QtHJR
[04:50:08] <joannac> zhulikas: okay, the "query" matches the "filter" stage
[04:50:28] <zhulikas> sorry, but what does that mean?
[04:50:31] <joannac> there's no reference to "given" or "name.given"
[04:50:35] <joannac> so what's your question?
[04:50:48] <zhulikas> oh, yes, this time i used "family" which queries on name.family
[04:50:56] <joannac> no it doesn't
[04:51:02] <joannac> it queries on the field "family"
[04:51:21] <zhulikas> then how does it return any results O.o
[04:51:30] <joannac> it doesn't
[04:51:33] <joannac> look at the results
[04:51:35] <zhulikas> as there is no "family" field on Patient
[04:51:48] <joannac> wait, it does
[04:51:52] <joannac> run the query again
[04:51:53] <zhulikas> yeah, one result
[04:52:20] <joannac> run the query in the shell then, assuming you have the same dataset
[04:52:26] <joannac> and see what the result is
[04:52:48] <zhulikas> http://pastebin.com/ZNA62Mfc
[04:52:54] <zhulikas> oh, in the shell it doesn't work
[04:53:03] <joannac> why not?
[04:53:12] <zhulikas> because then i'd be searching on "family" field :)
[04:53:18] <joannac> and?
[04:53:21] <zhulikas> and i assume "family" here is some sort of index for "name.family"
[04:53:25] <joannac> no
[04:53:53] <joannac> run the query, in the shell, and tell me what the output is
[04:53:59] <zhulikas> well this gives no results
[04:54:01] <zhulikas> > db.resources.find({ "family" : "Pavardenis" })
[04:54:04] <joannac> the actual query, not the query on system.profile
[04:54:13] <joannac> dude
[04:54:25] <joannac> run it on the database fhir and the collection searchIndex
[04:54:41] <zhulikas> oh, searchindex, mkay
[04:54:42] <joannac> sorry, searchindex. not capital letters
[04:55:32] <zhulikas> yeah, i got a result
[04:55:46] <zhulikas> > db.searchindex.find({ "family" : "Pavardenis" })
[04:55:54] <joannac> so there you go
[04:56:10] <zhulikas> super weird though, db.searchindex.find() doesn't show that result O.o
[04:56:20] <zhulikas> what the heck is going on
[04:56:27] <joannac> because you don't get the whole collection
[04:56:34] <joannac> you get a batch (default 20)
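
To illustrate what joannac means about batching: a bare find() in the shell only displays the first batch of 20 documents, so a matching document can easily be off-screen. A minimal sketch:

    db.searchindex.find()                          // prints 20 docs, then prompts: Type "it" for more
    it                                             // fetch the next batch
    db.searchindex.find({ family: "Pavardenis" })  // or filter instead of eyeballing batches
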
[04:56:55] <zhulikas> okay... so, searchindex. Is that actually a "search index" ?
[04:57:03] <zhulikas> like - can I configure it with more indexes?
[04:57:05] <joannac> No
[04:57:16] <joannac> No, that's the name of the collection
[04:57:22] <joannac> you can name your collection whatever you want
[04:57:34] <joannac> yes, you can create more indexes for that collection
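
A minimal sketch of adding an index to that collection, assuming the flattened field names seen above (adjust to whatever your application actually searches on):

    db.searchindex.createIndex({ family: 1 })   // single-field index on the flattened "family" field
    db.searchindex.getIndexes()                 // list the indexes that already exist
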
[04:58:37] <zhulikas> interesting. searchindex looks like it holds flat objects which are otherwise stored in "resources" collection
[05:00:06] <zhulikas> so, i'm thinking...
[05:00:21] <zhulikas> is it the application which stores object in searchindex and resources collections at the same time?
[05:00:32] <joannac> yes
[05:00:35] <zhulikas> flattening the object and storing in searchindex
[05:00:42] <joannac> it's your application
[05:00:45] <zhulikas> so it's easier to search without going into nested object structure
[05:00:52] <zhulikas> and then matches structures by their _id field
[05:01:22] <zhulikas> okay, meaning i need to figure out a way to hook into the moment when an object in the application is passed on to the searchindex collection, and make sure my other fields for indexing are added there
[05:01:32] <zhulikas> this approach is rather brilliant
[05:01:45] <zhulikas> okay, thanks a lot, man
[05:15:52] <zhulikas> yay got it working!
[05:16:20] <zhulikas> i'm trying to extend this application to support more search parameters so i just need a trigger to reload searchindex collection whenever i add a new search parameter
[05:37:34] <sahilsk> greetings
[05:38:40] <sahilsk> I've set quotaFiles to 2 and am trying to fill the database. It created one 64MB file and after a while threw an error on insert saying the quota limit was exceeded. Why so? It has only created one datafile so far??
[05:40:01] <sahilsk> Second confusion: when i create a new collection under the same database, it lets me create it. And it also created a second datafile (128MB). I'm able to insert data into this new collection. Does quotaFiles have to do with collections? One datafile per collection?
[05:41:26] <joannac> sahilsk: what version?
[05:41:32] <sahilsk> 3
[05:44:13] <sahilsk> now it has created a third datafile of size 256M, even though i set quotaFiles to 2 only. Why the hell is mongodb not enforcing the quota here?
[05:45:25] <sahilsk> I'm using a one-liner for filling up data: var i = 0; while (i < 10000000) { db.dummyData1.insert({ a: 2, b: "sodfndufd" }); i++ };
[05:45:51] <joannac> weird
[05:45:56] <joannac> file a SERVER ticket
[05:49:49] <joannac> actually, I think it's a combination of 2 known issues
[05:50:16] <sahilsk> i'm filing ticket joannac.
[05:51:07] <joannac> sahilsk: you can if you want, but it'll get resolved as a dupe
[05:51:21] <sahilsk> ohk....
[05:51:31] <sahilsk> i can't proceed further with this blocker.. :(
[05:51:34] <joannac> https://jira.mongodb.org/browse/SERVER-5136 and all the tickets it's linked to
[05:51:39] <joannac> sahilsk: what's the blocker?
[05:52:32] <sahilsk> i want to impose a quota limit.
[05:53:04] <sahilsk> and here my test cases are showing weird behaviour
[05:54:23] <joannac> for what purpose?
[05:55:36] <sahilsk> shared hosting
[05:56:14] <joannac> okay, so there are some off-by-one errors. set it down to 1 or maybe 0
[05:56:54] <sahilsk> what errors? where do I change it?
[05:57:17] <joannac> change it in the command line options or the config file
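
For reference, quotaFiles is an MMAPv1-era option and only takes effect when quotas are enabled; a sketch of setting it, with illustrative values (the old-style config keys shown here may differ in newer config formats):

    mongod --quota --quotaFiles 2

    # or, in a 2.6-style config file:
    quota = true
    quotaFiles = 2
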
[05:57:43] <sahilsk> oh.. you mean quotaFiles .. ok
[05:57:49] <sahilsk> testing AGAIN
[05:58:56] <sahilsk> joannac: what does quotaFiles 0 mean?? that i can't create collections?
[05:59:54] <joannac> no. test it. i think it'll mean "no more files after the preallocated ones"
[06:00:21] <sahilsk> lol... right ..give me a minute
[06:00:46] <sahilsk> I've noprealloc = true , btw
[06:01:19] <sahilsk> for small databases, those are the suggested params. i hope this won't interfere
[06:05:20] <sahilsk> joannac: with quotaFiles = 0, it let me create a database & collection. I'm inserting data into it
[06:06:16] <sahilsk> but it only let me insert 33 records :-/
[06:06:47] <joannac> sahilsk: okay?
[06:08:45] <sahilsk> now i created another collection. and it's letting me insert there :-/
[06:13:03] <sahilsk> On the new collection it let me insert 62 records, thereafter an error of quota exceeded
[06:14:17] <sahilsk> Now increasing quotaFiles to 1 and testing same
[06:20:03] <joannac> probably something to do with how the docs are actually stored on disk
[08:11:23] <Constg> Hello, I've upgraded MongoDB to 3.0.2 using MMS, but when I'm connecting to the shell, it still says MongoDB shell version: 2.6.9. Any idea why? Shouldn't it be 3.0.2 too?
[08:13:44] <sahilsk> Constg: make sure you're checking the version of the right thing. Mongo client or Mongo server??
[08:17:04] <Constg> well, I go on the server, and type mongo (with the port). I can see I'm on the Primary, for example
[08:45:02] <sahilsk> Constg: try this as well : db.version()
[08:45:25] <Constg> 3.0.2
[08:45:28] <sahilsk> if this shows you the old version then contact MMS support. they solve tickets quickly. :)
[08:45:54] <sahilsk> Constg: i knew it. you are running an old mongo client, and the version you saw was the mongo client's only.
[08:45:54] <Constg> so everything seems fine... :)
[08:45:56] <sahilsk> :)
[08:46:08] <Constg> thank you sahilsk
[08:46:15] <sahilsk> np
[09:33:03] <isale-eko> http://stackoverflow.com/questions/29819224/querying-for-last-7-or-30-days-date-range-with-mongodb-mongoid-rails
[09:54:06] <spuz> hi, I'm trying to use the mongoshell to execute a find command but I'm not having any luck
[09:56:45] <spuz> I've tried 'mongo host/database --eval 'db.Collection.find(_id:"foo")'' but it simply prints the command that I gave it rather than the output
[09:56:48] <spuz> any ideas?
[09:57:08] <spuz> sorry, that should be: 'mongo host/database --eval 'db.Collection.find({_id:"foo"})''
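
spuz's question goes unanswered in the log; the usual cause is that --eval doesn't auto-iterate cursors the way the interactive shell does, so find() just prints its own representation instead of the results. A sketch of the common workaround (host, database, and collection names are placeholders):

    mongo host/database --quiet --eval 'printjson(db.Collection.find({ _id: "foo" }).toArray())'
    mongo host/database --quiet --eval 'db.Collection.find({ _id: "foo" }).forEach(printjson)'
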
[10:54:10] <mtroy> export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/opt/boost/lib
[10:54:10] <pamp> Hi guys
[10:54:45] <pamp> i'm writing 2 million docs from the c# driver
[10:56:44] <pamp> how do I know if the mongodb is using all the processor cores
[10:56:45] <pamp> ?
[11:05:11] <sahilsk> run top command and check the load yourself pamp
[11:06:24] <pamp> sahilsk I'm on a windows server machine, what's the equivalent command on windows?
[11:06:33] <pamp> thanks for the answer
[11:07:01] <sahilsk> (die window user , die..... ) sorry... don't know :(
[11:14:20] <pamp> I prefer linux, but the company wants windows
[11:14:22] <pamp> :/
[11:15:03] <Nomikos> does this help? I don't know what powershell is exactly, but http://superuser.com/questions/176624/linux-top-command-for-windows-powershell
[11:16:11] <den21> hey guys
[11:16:27] <den21> I'm looking for something like the 'InsertFlags.ContinueOnError' for InsertBatch in the previous c# mongo driver. Does anyone here know?
[11:16:37] <den21> Using c# driver, I wanted to insert multiple documents(index set on a key). I'm using the 'InsertManyAsync', but when a duplicate based on index occurs, I get an error "A bulk write operation resulted in one or more errors."
[11:16:43] <den21> I can't find anything in documentation that will just continue inserting the rest of the documents, ignoring the duplicate/error.
[11:18:31] <den21> await collection.Indexes.CreateOneAsync(Builders<BsonDocument>.IndexKeys.Ascending("u"), new CreateIndexOptions { Unique = true});
[11:18:44] <den21> then inserting:
[11:18:45] <den21> await collection.InsertManyAsync(xx /*{"u":"http://test1.com"}, {"u":"http://test2.com"}, {"u":"http://test1.com"}*/);
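
den21's question also goes unanswered here; the rough equivalent of ContinueOnError in the newer APIs is an unordered insert, which keeps going past duplicate-key errors and reports them all at the end (the C# 2.x driver exposes this through its insert-many options, if memory serves). In shell syntax, with placeholder documents:

    db.links.insert(
        [ { u: "http://test1.com" }, { u: "http://test2.com" }, { u: "http://test1.com" } ],
        { ordered: false }    // the duplicate still errors, but the remaining docs are inserted
    )
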
[11:24:56] <pamp> Nomikos , thanks,
[11:25:08] <pamp> I already tried this script
[11:25:36] <pamp> but I can't tell what the values mean
[11:25:55] <pamp> and it doesn't give me what I need
[11:26:40] <pamp> to know how many cores are being used for bulk write operations
[12:08:19] <amitprakash> Hi, is it possible to see document counts vs aging buckets using aggregations?
[12:08:36] <amitprakash> i.e. x documents < 1 day old, < 3 days old, 3+ days old
[12:28:24] <deathanchor> amitprakash: as a single aggregation command? I don't think so, but if you indexed by some time you can just do a match on the buckets of time you want.
[12:41:49] <amitprakash> deathanchor, I've indexed on a field containing an ISODate. However, I don't see how I can group it into buckets of [now - 24h, now], [now - 72h, now - 24h], and [before now - 72h]
[12:42:52] <amitprakash> deathanchor, you mean 3 separate queries for each bucket?
[12:45:13] <deathanchor> yeah.
[12:45:36] <amitprakash> :(
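
For what it's worth, the bucketing can be approximated in a single pipeline by computing a label with $cond and grouping on it. A rough sketch, assuming a collection "docs" with a "created" date field (both names are guesses) and boundaries computed client-side:

    var now = new Date();
    var d1  = new Date(now - 24 * 3600 * 1000);   // 1 day ago
    var d3  = new Date(now - 72 * 3600 * 1000);   // 3 days ago

    db.docs.aggregate([
        { $project: { bucket: { $cond: [ { $gte: [ "$created", d1 ] }, "< 1 day",
                          { $cond: [ { $gte: [ "$created", d3 ] }, "1-3 days", "3+ days" ] } ] } } },
        { $group: { _id: "$bucket", count: { $sum: 1 } } }
    ])
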
[12:47:46] <XThief> hi, im new to mongodb and im configuring a server for production, do I get a big benefit if I use SSD instead of a regular 7200 RPM hardrive when all my queries are really simple and are indexed?
[12:57:39] <deathanchor> hey is there a char for comments in the mongo shell?
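
deathanchor's question doesn't get an answer in the log; since the mongo shell is JavaScript, the ordinary JS comment syntax works:

    // a single-line comment
    /* a block
       comment */
    db.foo.count()   // trailing comments work too
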
[14:22:06] <pamp> I guys
[14:22:11] <pamp> Hi*
[14:22:17] <GothAlice> Allo allo.
[14:22:54] <pamp> Is it normal to take >76 minutes to write >5 500 000 docs?
[14:23:53] <pamp> only 1207 docs per second
[14:24:04] <pamp> with C# driver
[14:24:30] <fxmulder> so I started a new replica member and its pulling over data, when I first started it was doing about 2000 objects per minute but now looking at it a couple days later it is doing about 100 objects per second, any idea why that would slow down so much?
[14:24:37] <fxmulder> 100 objects per minute that is
[14:25:04] <pamp> avgObjSize 2358
[14:25:26] <GothAlice> pamp: How were you loading those records?
[14:25:35] <fxmulder> "avgObjSize" : 31072.1376588041
[14:26:39] <fxmulder> at this rate this thing will never finish
[14:28:15] <pamp> GothAlice http://prntscr.com/6x8mhj
[14:28:48] <pamp> first I query a collection and then insert all the docs into another
[14:29:57] <GothAlice> pamp: You may be dogpiling your primary. Unacknowledged write concern also means you're ignoring all possible errors that could arise.
[14:30:27] <GothAlice> pamp: By "dog pile" I'm referring to the accumulation of operations as the disk IO can't keep up with network IO.
[14:32:26] <likarish> I'm working on designing a schema and running into a problem. Here's what a document currently looks like: {_id: xxx, entries: [{_id: 1, ver: 2}, {_id: 2, ver: 10}]}
[14:33:15] <likarish> I'd like to be able to update entries so that if entries._id is not found, then it will insert an element. If the element exists, then it would set the new ver.
[14:33:54] <likarish> From what I've read, that doesn't sound possible. Any suggestions on how I could change this schema for this operation?
[14:36:12] <likarish> How it's designed now, I'd need to be able to upsert an element in the entries field.
[14:44:03] <likarish> Think I'll end up doing something like the answer G-Wiz gives. http://stackoverflow.com/questions/8871363/upsert-array-elements-matching-criteria-in-a-mongodb-document/21347039#21347039
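
One generic version of the update-or-push pattern discussed in that answer, sketched against likarish's example schema (collection name and values are made up; there is a small race window between the two updates):

    var docId = "xxx";   // placeholder for the parent document's _id

    // 1) try to bump the version of an existing element
    var res = db.things.update(
        { _id: docId, "entries._id": 2 },
        { $set: { "entries.$.ver": 11 } }
    );

    // 2) if no element matched, append a new one instead
    if (res.nMatched === 0) {
        db.things.update(
            { _id: docId },
            { $push: { entries: { _id: 2, ver: 11 } } }
        );
    }
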
[14:46:12] <GothAlice> likarish: As a note, if you do that, you lose any effective ability to use indexes on that nested data.
[14:46:24] <GothAlice> That can be a bit of a downer.
[14:47:15] <pamp> GothAlice, yes, I think my problem is disk IO
[14:47:17] <likarish> hmm, good point.
[14:47:50] <likarish> In design phase, so any alternatives I could look at?
[14:47:51] <GothAlice> pamp: Using a higher write concern (e.g. confirmed-journal) will make your insertions operate more in lock-step with disk IO.
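
For reference, what a higher write concern looks like from the shell, as a minimal sketch (collection and document are placeholders; the C# driver exposes the same settings through its WriteConcern options):

    // wait for the write to be acknowledged and journaled on the primary
    db.target.insert({ a: 2, b: "xyz" }, { writeConcern: { w: 1, j: true } })
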
[15:08:33] <juliofreitas> Hi everyone! I saw the "TimeSeries data" MongoDB webinar, but I don't know how to create the collection with times, how to index by time, and then, when I have the value, how to update the field. Could anyone send me a tutorial?
[15:08:33] <juliofreitas> My scenario:
[15:08:33] <juliofreitas> I have a file that is updated every 15 minutes. I'll parse it and update my collection. The schema is: 2015-04-15T00:15:00Z (time), city, state...
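
juliofreitas's question isn't answered in the log; the usual webinar pattern is one document per location per day (or hour), updated in place as each 15-minute sample arrives. A hedged sketch with entirely illustrative names and values:

    // one document per city per day; samples filled in as they arrive
    db.readings.insert({
        _id: "Springfield|XX|2015-04-15",
        city: "Springfield", state: "XX",
        day: ISODate("2015-04-15T00:00:00Z"),
        samples: {}
    });

    // when the 00:15 value arrives, set just that slot
    db.readings.update(
        { _id: "Springfield|XX|2015-04-15" },
        { $set: { "samples.00:15": 42 } }
    );

    // index by time for range queries across days
    db.readings.createIndex({ day: 1 });
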
[16:14:58] <deathanchor> is there a mongo command for checking if a db exists without creating it?
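
One way to answer that from the shell; getDBNames() only lists databases that actually exist with data:

    db.getMongo().getDBNames().indexOf("mydb") !== -1   // "mydb" is a placeholder name
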
[16:18:51] <fxmulder> found the problem, elasticsearch was using a large load of memory and causing things to swap on one of the machines
[16:20:09] <fxmulder> back to around 2000 objects per second
[16:27:14] <derbie> Hey guys! A little mongo / mongoose issue: Considering there's a collection cars and another one logs. I have an array of unique carIDs that i need to query the logs collection and get the latest entry for each carID
[16:27:25] <derbie> What's a better way than to loop through each ID and query the DB for each car?
[16:32:56] <pamp> Is it possible to separate indexes and dbs onto different drives?
[16:36:27] <derbie> Considering an array of UserIDs and a collection `logs` with structure (_id , userid, date, aValue); What's the most efficient way to query the database for most recent (date) entry for each user in the array?
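
One way to avoid a per-ID loop is a single aggregation: restrict to the IDs of interest, sort newest-first, and take the first entry per user. A sketch using the field names derbie lists (userIds is a placeholder array):

    var userIds = [ "u1", "u2", "u3" ];   // the array of UserIDs

    db.logs.aggregate([
        { $match: { userid: { $in: userIds } } },
        { $sort: { userid: 1, date: -1 } },
        { $group: {
            _id:    "$userid",
            date:   { $first: "$date" },
            aValue: { $first: "$aValue" }
        } }
    ])
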
[16:59:20] <bdrewery> I realize there's no transactions. I want to run an update that may damage data. Is there a good way to do this and verify the data before it is finally committed, without just shutting down mongod and "copying" the db file first?
[17:00:43] <cheeser> do a mongodump, mongorestore to a test db. test there.
[17:01:22] <bdrewery> thanks
[17:48:51] <pamp> Is it possible to separate indexes and dbs onto different drives?
[17:58:49] <cheeser> pamp: there's this http://docs.mongodb.org/manual/reference/configuration-options/#storage.directoryPerDB sort of along those lines
[17:59:09] <cheeser> http://docs.mongodb.org/manual/reference/configuration-options/#storage.wiredTiger.engineConfig.directoryForIndexes
[19:21:16] <ngl> I am told this is "done!", but it is not in my "templates" gfs bucket. What is wrong?
[19:21:20] <ngl> mongofiles -d svm -c templates put master-flash.sh
[19:23:12] <ngl> Please? I looked up and pointed at the damn Canadian geese that are pecking cars here and my neck snapped and I can't look right so it would be really nice if somebody could help a brother out so I can go try to get fixed.
[19:37:46] <Siecje> What is the recommended way to add attributes when a relationship is added?
[19:46:47] <ToeSnacks> I have been draining a shard in my cluster for 3 days and the chunks count has not gone down at all, how can I verify Mongo is actually draining and not in a hung state?
[19:47:06] <ToeSnacks> remaining chunks I mean
[20:11:24] <yhager> I've dropped a collection, but it's still there - taking up disk space in /var/lib/mongodb, and counted in dbStats - but not shown on 'show collections'. how can I get truly get rid of it?
[20:25:11] <gswallow> I'm running through some tests because our lead dev said to do it, but I can't find any reason *why* I'm testing this other than anecdotes in Google Groups. Why does mongoexport --eval '{ $and: [ { "_id": { $exists: true } }, ….' appear to run so much faster than mongoexport using indexes?
[20:48:20] <tejasmanohar> hey guys
[20:48:26] <tejasmanohar> anyone familiar w/ connecting to mongo from console?
[20:48:51] <tejasmanohar> i have the URI mongodb://something:somethingDEV@ip/something-dev
[20:50:21] <tejasmanohar> when i do `mongo {{ that URI }}` it says invalid port number something:somethingDEV@ip in connection string {{ whole string }}
[20:52:36] <cheeser> mongo --user something --password something --host ip something-dev
[20:52:58] <tejasmanohar> oh you cant use the URI, cheeser ?
[20:53:40] <cheeser> doesn't look like you can from the cli
[20:54:23] <tejasmanohar> cheeser: unrecognized option --user?
[20:54:52] <tejasmanohar> mongo --user lol --password lolol --host XXX.XX.XX.XXX
[20:55:32] <cheeser> mongo -h
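
For the record, the options the shell actually recognizes are --username/-u and --password/-p (plus --authenticationDatabase when the user is defined elsewhere), so something along these lines should work:

    mongo --host ip --username something --password somethingDEV something-dev
    mongo ip/something-dev -u something -p somethingDEV --authenticationDatabase admin
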
[21:28:18] <miceiken> So I've seen that you "create" a database by just using it. Does the same apply when connecting to mongo://localhost/<db>?
[21:33:32] <cheeser> yes
[22:34:03] <pjammer> evening. Is there a method equivalent to rs.uninitiate() for those of us who add more than one rs.initiate() in a replica set?
[22:52:14] <blizzow> Can I use mongodump to backup all of my collections/dbs via a mongos instance for a replicated sharded cluster? (I don't particularly care about the local DBs on each replica set.)
[22:58:16] <cheeser> i believe so yes.
[22:58:27] <cheeser> how much data?
[22:58:44] <blizzow> 670GB.
[22:58:54] <blizzow> 10GBe network.
[22:58:55] <cheeser> that would take quite a while...
[22:59:56] <blizzow> My other question is, How bad will doing so lock my DB up?
[23:00:38] <blizzow> I'm using 2.6.9
[23:01:04] <cheeser> it wouldn't totally lock it, no.
[23:01:12] <cheeser> but it's not something you'd want to do regularly.
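
A minimal sketch of the kind of dump being discussed, pointed at a mongos so shard routing is handled for you (host and output path are placeholders):

    mongodump --host mongos-host:27017 --out /backups/2015-04-23
    mongorestore --host mongos-host:27017 /backups/2015-04-23   # to restore it later
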
[23:03:13] <cheeser> (disclaimer: i'm a mongo employee) but you really should look in to mms backup
[23:03:38] <cheeser> or something like it. i don't know of any other backup solutions that don't require hosting with the service, too.
[23:07:49] <GothAlice> I can highly recommend it.
[23:07:51] <blizzow> If I remember correctly, mms backup is $5/GB/month of churned data. We churn ~200+GB /day. :/
[23:08:28] <blizzow> Not even accounting for the data transfer costs in/out of my datacenter.
[23:08:44] <GothAlice> At work we exclude our logs, pre-aggregated analytic data, and any capped collections from backup. This saves a lot on churn.
[23:08:54] <GothAlice> All of those things can be rebuilt.
[23:09:36] <cheeser> First 1GB per replica set free
[23:09:37] <cheeser> and then $2.50 / GB / month
[23:09:43] <cheeser> or $30 / GB / year prepaid
[23:10:18] <GothAlice> cheeser: Still amusingly expensive for terabyte-level deployments. ;)
[23:10:29] <cheeser> agreed.
[23:10:36] <cheeser> but still, give us your money. :D
[23:10:46] <GothAlice> An average month would cost me $15,000 on my home dataset. XP
[23:11:02] <cheeser> will that be cash or credit?
[23:11:12] <GothAlice> You can have my next born, maybe?
[23:11:15] <blizzow> *pennies
[23:11:56] <cheeser> i have two of my own... though the company could always use a good intern...
[23:45:40] <GothAlice> cheeser: https://www.youtube.com/watch?v=Wmx6Q0YLH8A < related
[23:49:06] <GothAlice> Stitch 'em tight. *nods*
[23:53:37] <ToeSnacks> I have been draining a shard in my cluster for 3 days and the remaining chunks count has not gone down at all, how can I verify Mongo is actually draining and not in a hung state?